Abstract. For a bivariate time series $((X_i, Y_i))_{i=1,\ldots,n}$ we want to detect whether the
correlation between $X_i$ and $Y_i$ stays constant for all $i = 1, \ldots, n$. We propose a nonparametric
change-point test statistic based on Kendall's tau. The asymptotic distribution under the
null hypothesis of no change follows from a new U-statistic invariance principle. Assuming a
single change-point, we show that the location of the change-point is consistently estimated.

Kendall's tau possesses a high efficiency at the normal distribution, as compared to the 
normal maximum likelihood estimator, Pearson's moment correlation. Contrary to Pearson's 
correlation coefficient, it has excellent robustness properties and shows no loss in efficiency 
at heavy-tailed distributions. The motivation for this research article originates in financial 
data situations, where heavy tails are common and Kendall's tau is a more efficient estimator 
than the moment correlation. 

We assume the data $((X_i, Y_i))_{i=1,\ldots,n}$ to be stationary and P-near epoch dependent on
an absolutely regular process. This large class of processes includes all common time series
models as well as many chaotic dynamical systems. The P-near epoch dependence condition
constitutes a generalization of the usually considered $L_p$-near epoch dependence, $p \ge 1$, that
allows for arbitrarily heavy-tailed data.

We investigate the test numerically and compare it to previous proposals. 
Key words and phrases. Change-point analysis, Kendall's tau, U-statistic, functional limit theorem, near
epoch dependence.

1. Introduction
The problem of detecting changes in the distribution of sequential observations has a long
history in statistics, see e.g. Csörgő and Horváth (1997). However, detecting changes in the
dependence structure of multivariate time series in particular has only recently attracted
the focus of statistical research (Galeano and Peña 2007; Aue, Hörmann, Horváth, and
Reimherr 2009; Kao, Trapani, and Urga 2011).


The authors' interest in the field originates in economic data analysis. For risk management
and portfolio optimization the dependence between financial asset prices is of enormous
importance. It is often assumed to be constant over the observed time period, which is a
simplifying assumption that is evidently violated for longer observation periods. For good
statistical modeling and successful decision making it is essential to detect changes in the
association of financial price processes and, within reasonable time frames, re-estimate the
correlation parameters. In particular, in times of global financial crises, the price processes of
most financial assets tend to be highly dependent, united in their common downward trend,
causing the hedging powers of investment diversification to cease, an effect for which the
term diversification meltdown has been coined (Campbell et al., 2003).


Recently, Wied, Krämer, and Dehling (2012) proposed a test for change in correlation
based on Pearson's moment correlation. We recommend using Kendall's tau instead of
Pearson's correlation coefficient because it possesses a higher efficiency at heavy-tailed
distributions. For details see Section 5.

The paper is organized as follows: Section 2 contains the main results, which concern the
asymptotic behavior of the test statistic under the null hypothesis of no change in correlation.
Theorem 2.6 gives the asymptotic distribution of the test statistic. The asymptotic
distribution contains a long run variance parameter; Theorem 2.9 shows the consistency of
a proposed estimator for the long run variance parameter and hence allows us to formulate an
asymptotically distribution-free version of the test statistic. Section 3 contains the theoretical
groundwork: two functional limit theorems for weakly dependent processes (Theorems 3.2
and 3.5), which are the cornerstones of the proofs of the results of Section 2 but which are
also of interest in their own right. Section 4 deals with estimating the location of a potential
change-point.

Section 6 examines the properties of the proposed statistical procedure numerically, and
Section 7 contains applications to real life data examples. All proofs are deferred to the
appendix.



We use bold type face to denote vector-valued objects. Throughout, $|\cdot|_p$ denotes the
p-norm in $\mathbb{R}^q$, $p \in [1, \infty)$, $q \in \mathbb{N}$. To denote the $L_p$ norm of a real-valued random variable
$X$, we write $(E|X|^p)^{1/p}$, $p \in [1, \infty)$.



2. Statement of Main Results 

Let $((X_i, Y_i))_{i \ge 1}$ be a stationary process of bivariate random vectors with marginal
distribution function

    $F(x, y) = P(X_1 \le x, Y_1 \le y).$

Throughout the article, we assume $F$ to be Lipschitz continuous. For practical purposes
this is fulfilled if $F$ possesses a bounded density, but, for instance, $X = Y$ is also allowed.
Kendall's rank correlation coefficient, also known as Kendall's tau, is a measure of dependence
between the marginals $X_i$ and $Y_i$. Kendall's tau is defined as

    $\tau = P((X' - X)(Y' - Y) > 0),$

where $(X, Y)$ and $(X', Y')$ are two independent random vectors with distribution function
$F(x, y)$. The sample version of Kendall's tau is defined as

    $\hat\tau_n = \binom{n}{2}^{-1} \#\{1 \le i < j \le n : (X_j - X_i)(Y_j - Y_i) > 0\}.$
Remark 2.1.

(i) The pairs $(X_i, Y_i)$ and $(X_j, Y_j)$ are called concordant if $X_j - X_i$ has the same sign as
$Y_j - Y_i$. Thus, $\hat\tau_n$ gives the fraction of concordant pairs among all pairs.

(ii) Often, Kendall's tau is defined slightly differently as

    $\tilde\tau = P((X' - X)(Y' - Y) > 0) - P((X' - X)(Y' - Y) < 0),$

which has the same range $[-1, 1]$ as Pearson's linear correlation. For continuous $F$,
which we assume throughout, we have $\tilde\tau = 2\tau - 1$, and the same one-to-one
correspondence holds almost surely for the respective sample versions.

(iii) The estimator $\hat\tau_n$ is a U-statistic with kernel $h : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ defined as

    (1)   $h((x_1, y_1), (x_2, y_2)) = 1_{\{(x_2 - x_1)(y_2 - y_1) > 0\}}.$

We will make use of this fact in our analysis of the asymptotic distribution of $\hat\tau_n$.
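
The sample version of Kendall's tau and its relation to the classical definition in Remark 2.1 (ii) can be illustrated by a short sketch. The function names are hypothetical, the data are assumed to be free of ties, and the quadratic-time pair enumeration is chosen for transparency rather than speed.

```python
from itertools import combinations


def kendall_tau_concordance(x, y):
    """Fraction of concordant pairs among all n(n-1)/2 pairs of observations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = sum(
        1
        for i, j in combinations(range(n), 2)
        if (x[j] - x[i]) * (y[j] - y[i]) > 0
    )
    return concordant / (n * (n - 1) / 2)


def kendall_tau_classical(x, y):
    """Concordant minus discordant fraction; equals 2*tau - 1 if there are no ties."""
    return 2.0 * kendall_tau_concordance(x, y) - 1.0
```

For perfectly monotone increasing data the first function returns 1 and the second returns 1 as well; for perfectly decreasing data they return 0 and -1, respectively.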

In this paper, we will study a test for change in the dependence structure of the marginals
by the test statistic

    $T_n = \max_{1 \le k \le n} \frac{k}{\sqrt{n}} |\hat\tau_k - \hat\tau_n|.$
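
A minimal sketch of how the statistic $T_n$ might be computed is given here. It updates the number of concordant pairs each time one observation is added, which gives quadratic total cost. Two conventions are assumptions of this sketch and not taken from the text: the maximum starts at $k = 2$, because $\hat\tau_1$ involves no pairs, and the function name and return value are hypothetical.

```python
import math


def kendall_cusum(x, y):
    """Return (T_n, k_max): the CUSUM-type statistic and the maximizing index k.

    tau[k] is the fraction of concordant pairs among the first k observations.
    The index k = 1 is skipped because no pair exists there (assumption).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length, at least 3")
    concordant = 0
    tau = {}
    for k in range(2, n + 1):
        j = k - 1  # zero-based index of the newly added observation
        concordant += sum(
            1 for i in range(j) if (x[j] - x[i]) * (y[j] - y[i]) > 0
        )
        tau[k] = concordant / (k * (k - 1) / 2)
    tau_n = tau[n]
    k_max = max(range(2, n + 1), key=lambda k: k * abs(tau[k] - tau_n))
    t_n = k_max * abs(tau[k_max] - tau_n) / math.sqrt(n)
    return t_n, k_max
```

The maximizing index is returned as well because it is the natural candidate for the change-point location discussed in Section 4.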



Theorem 2.6 below gives the asymptotic distribution of the test statistic under the null
hypothesis of no change in the correlation between the marginals of the pairs $(X_i, Y_i)$.
Concerning the serial dependence structure of the process $((X_i, Y_i))_{i \ge 1}$, we assume that it is
P-near epoch dependent (P-NED) on an absolutely regular process. The formal statement
of this short range dependence assumption follows below. For simplicity of notation and
consistency with large parts of the literature it is formulated for doubly infinite sequences of
random vectors indexed by $\mathbb{Z}$. The observed data is then the positive branch of the doubly
infinite sequence.

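For two $\sigma$-fields $\mathcal{A}$, $\mathcal{B}$ on the probability space $(\Omega, \mathcal{F}, P)$, we define the absolute regularity
coefficient

    $\beta(\mathcal{A}, \mathcal{B}) = E\left[\operatorname{ess\,sup}\{|P(A \mid \mathcal{B}) - P(A)| : A \in \mathcal{A}\}\right].$

The absolute regularity coefficient is a measure of dependence between the $\sigma$-fields $\mathcal{A}$ and
$\mathcal{B}$; it lies between 0 and 1, and equals 0 if $\mathcal{A}$ and $\mathcal{B}$ are independent.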

Definition 2.2. Let $(X_n)_{n \in \mathbb{Z}}$ and $(Z_n)_{n \in \mathbb{Z}}$ be q- and r-variate stochastic processes on
$(\Omega, \mathcal{F}, P)$, respectively, $q, r \ge 1$, such that the $(q + r)$-variate process $((X_n, Z_n))_{n \in \mathbb{Z}}$ is
stationary. For $k \le n$, let $\mathcal{F}_k^n = \sigma(Z_k, \ldots, Z_n)$, where also $k = -\infty$ and $n = \infty$ are
permitted.

(i) The process $(Z_n)_{n \in \mathbb{Z}}$ is called absolutely regular if the absolute regularity coefficients

    $\beta_k = \beta(\mathcal{F}_{-\infty}^{0}, \mathcal{F}_{k}^{\infty}), \quad k \ge 1,$

converge to zero as $k \to \infty$.
(ii) The process $(X_n)_{n \in \mathbb{Z}}$ is called $L_p$ near epoch dependent ($L_p$-NED), $p \ge 1$, on the process
$(Z_n)_{n \in \mathbb{Z}}$ if the approximating constants

    $a_{p,k} = \left(E\,|X_0 - E(X_0 \mid \mathcal{F}_{-k}^{k})|_1^p\right)^{1/p}, \quad k \ge 1,$

converge to zero as $k \to \infty$.
(iii) The process $(X_n)_{n \in \mathbb{Z}}$ is called near epoch dependent in probability or short P-near epoch
dependent (P-NED) on the process $(Z_n)_{n \in \mathbb{Z}}$ if there is a sequence of approximating
constants $(a_k)_{k \in \mathbb{N}}$ with $a_k \to 0$ as $k \to \infty$, a sequence of functions $f_k : \mathbb{R}^{r \times (2k+1)} \to \mathbb{R}^q$,
$k \in \mathbb{N}$, and a non-increasing function $\Phi : (0, \infty) \to (0, \infty)$ such that

    (2)   $P(|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1 > \varepsilon) \le a_k \Phi(\varepsilon)$

for all $k \in \mathbb{N}$ and $\varepsilon > 0$.

Remark 2.3.

(i) The usual $L_p$-near epoch dependence, $p \ge 1$, is of lesser interest in the following. It is
mentioned primarily to be put in contrast to P-near epoch dependence. It also appears
in the proofs, where we make use of results formulated for $L_1$- and $L_2$-NED sequences.
The main connection between the different approximation concepts is given by Lemma
2.5 below.

(ii) The P-NED condition is equivalent to convergence in probability of $f_k(Z_{-k}, \ldots, Z_k)$
to $X_0$ for $k \to \infty$. If the latter is true, i.e., if [...] fulfills (2). The requirement of the
bound on $P(|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1 > \varepsilon)$ in (2) to factorize into an $\varepsilon$-part and a
$k$-part is not a restriction, but merely facilitates rate computations.

(iii) Similar conditions that embody the idea of approximating functionals in a probability
sense are S-mixing considered by Berkes, Hörmann, and Schauer (2009) and the
$L_0$-approximability of Pötscher and Prucha (1997, Chapter 6), who refer to the stochastic
stability concept of Bierens (1981).

(iv) The terminology near epoch dependence may be interpreted as follows: Near epoch
dependence (in either definition) implies that there exists a measurable function
$f : \mathbb{R}^{r \times \mathbb{Z}} \to \mathbb{R}^q$ such that $X_0 = f((Z_k)_{k \in \mathbb{Z}})$ and, by the stationarity of $((X_n, Z_n))_{n \in \mathbb{Z}}$,
also $X_n = f((Z_{n+k})_{k \in \mathbb{Z}})$ for all $n \in \mathbb{Z}$. Thus in principle, $X_n$ depends on the whole
process $(Z_{n+k})_{k \in \mathbb{Z}}$, but the NED condition ensures that the dependence vanishes for
the distant past and future, and $X_n$ primarily depends on the "near epoch" of $Z_n$.
An alternative terminology is "$(X_n)_{n \in \mathbb{Z}}$ is an approximating functional of $(Z_n)_{n \in \mathbb{Z}}$"
(where, as for NED, it remains to specify in which sense the approximation is meant).

We choose to consider P-near epoch dependence instead of the more frequently considered
$L_p$ near epoch dependence, $p \ge 1$, since we particularly want to analyze heavy-tailed data
and do not want to assume the existence of even first moments. The P-NED condition is a
weaker assumption than $L_p$-NED, see Lemma 2.5 below, and the main results of this paper,
in particular Theorems 2.6, 2.9, 3.2 and 3.5, hold for $L_p$-NED sequences as well. On the
other hand, the P-NED condition substantially enlarges the class of processes for which the
condition is easily checked by many heavy-tailed distributions. An example is given in the
following lemma.
Lemma 2.4. Let $(Z_n)_{n \in \mathbb{Z}}$ be an i.i.d. sequence of $\mathbb{R}^q$-valued random vectors with
$P(|Z_n|_1 > t) \le C t^{-\alpha}$ for some $\alpha, C > 0$. Then for $a \in (-1, 1)$, the autoregressive process

    $X_n = \sum_{k=0}^{\infty} a^k Z_{n-k}$

is P-NED on $(Z_n)_{n \in \mathbb{Z}}$ with $a_k = |a|^{\alpha k}$ and $\Phi(\varepsilon) = K \varepsilon^{-\alpha}$ for some constant $K > 0$.



Lemma 2.4 poses very weak conditions on the innovation distribution; in particular, the
existence of a density is not required. This should be compared to analogous conditions for
an AR(1) process to be mixing, see e.g. Withers (1981). All standard examples of discrete
or heavy-tailed (e.g. Pareto, Cauchy, geometric) distributions are included here. The next
lemma connects P-NED and $L_p$-NED.


Lemma 2.5. Let $((X_n, Z_n))_{n \in \mathbb{Z}}$ be as in Definition 2.2.

(i) If $(X_n)_{n \in \mathbb{Z}}$ is P-NED on $(Z_n)_{n \in \mathbb{Z}}$ with functions $\Phi$ and $f_k$, $k \in \mathbb{N}$, and approximating
constants $(a_k)_{k \in \mathbb{N}}$, and $g$ is a Lipschitz continuous function on $\mathbb{R}^q$ with Lipschitz
constant $L$, then the process $(g(X_n))_{n \in \mathbb{Z}}$ is P-NED on $(Z_n)_{n \in \mathbb{Z}}$ with functions
$\Psi(\varepsilon) = \Phi(\varepsilon / L)$ and $g \circ f_k$, $k \in \mathbb{N}$, and the same approximating constants $(a_k)_{k \in \mathbb{N}}$.

(ii) Let $(X_n)_{n \in \mathbb{Z}}$ be bounded and P-NED on $(Z_n)_{n \in \mathbb{Z}}$ with functions $\Phi$ and $f_k$, $k \in \mathbb{N}$,
and approximating constants $(a_k)_{k \in \mathbb{N}}$. Then $(X_n)_{n \in \mathbb{Z}}$ is $L_p$-NED on $(Z_n)_{n \in \mathbb{Z}}$ for any
$p \ge 1$. If there is further a sequence $(s_k)_{k \in \mathbb{N}}$ of non-negative numbers such that

    (3)   $a_k \Phi(s_k) = O(s_k) \quad (k \to \infty),$

then the $L_p$-NED approximating constants $(a_{p,k})_{k \in \mathbb{N}}$ of $(X_n)_{n \in \mathbb{Z}}$ satisfy $a_{p,k} = O(s_k^{1/p})$
for $k \to \infty$.
(iii) Let $(X_n)_{n \in \mathbb{Z}}$ be $L_p$-NED, $p \ge 1$, on $(Z_n)_{n \in \mathbb{Z}}$ with approximating constants $(a_{p,k})_{k \in \mathbb{N}}$.
Then $(X_n)_{n \in \mathbb{Z}}$ is P-NED on $(Z_n)_{n \in \mathbb{Z}}$. If there is further a non-increasing function
$\Phi : (0, \infty) \to (0, \infty)$ and a sequence $(s_k)_{k \in \mathbb{N}}$ of non-negative numbers converging to
zero that satisfy

    $\Phi(\varepsilon)\, s_k \ge \left(\frac{a_{p,k}}{\varepsilon}\right)^p,$

then $(X_n)_{n \in \mathbb{Z}}$ is P-NED on $(Z_n)_{n \in \mathbb{Z}}$ with approximation constants $(s_k)_{k \in \mathbb{N}}$ and function
$\Phi$. The functions $f_k$ can be chosen as $f_k(Z_{-k}, \ldots, Z_k) = E(X_0 \mid \mathcal{F}_{-k}^{k})$, $k \in \mathbb{N}$.

Since $\Phi$ is non-increasing, condition (3) puts an upper bound on the speed of decay of
$(s_k)_{k \in \mathbb{N}}$ in the sense that, if (3) is fulfilled by some sequence $(s_k)_{k \in \mathbb{N}}$, then it is also fulfilled
by any sequence $(\tilde s_k)_{k \in \mathbb{N}}$ for which $s_k \le \tilde s_k$ for all $k$ larger than some $n \in \mathbb{N}$. We are now
ready to formulate our first main theorem, which concerns the asymptotic behavior of the
test statistic $T_n$ under the null hypothesis of stationarity.

Theorem 2.6. Let $((X_i, Y_i))_{i \ge 1}$ be a two-dimensional, stationary process that is P-NED with
approximating constants $(a_k)_{k \ge 1}$ and function $\Phi$ on an absolutely regular process with absolute
regularity coefficients $(\beta_k)_{k \ge 1}$ such that

(4) ak^k'^^+'^) = 0{k~^''+'^) and (3k = 0(A;-(^+^)) 
for some $\varepsilon > 0$. Then

    (5)   $T_n \xrightarrow{d} 2D \sup_{0 \le \lambda \le 1} |B(\lambda)|,$

where $(B(\lambda))_{0 \le \lambda \le 1}$ denotes a Brownian bridge,

    (6)   $D^2 = \mathrm{Var}(\psi(X_1, Y_1)) + 2 \sum_{i=2}^{\infty} \mathrm{Cov}(\psi(X_1, Y_1), \psi(X_i, Y_i))$

and $\psi(x, y) = 2F(x, y) - F_X(x) - F_Y(y) + 1 - \tau$.

Remark 2.7. The distribution of $\sup_{0 \le \lambda \le 1} |B(\lambda)|$ is known and often referred to as the
Kolmogorov distribution. It is the limiting distribution of the two-sided Kolmogorov-Smirnov
test statistic. Its cdf is

    (7)   $F_K(x) = 1 + 2 \sum_{k=1}^{\infty} (-1)^k e^{-2 k^2 x^2}, \quad x > 0.$
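
Critical values for the test can be obtained by evaluating and inverting the series (7). The following sketch truncates the series and inverts it by bisection; the function names, the truncation at 100 terms, and the cut-off below 0.2 (where the cdf is smaller than $10^{-12}$ and the alternating series converges slowly) are choices of this sketch.

```python
import math


def kolmogorov_cdf(x, terms=100):
    """Cdf of sup |B(lambda)| via the alternating series, truncated after `terms`."""
    if x < 0.2:
        # The true value is below 1e-12 here; treat it as zero.
        return 0.0
    series = sum((-1) ** k * math.exp(-2.0 * k * k * x * x)
                 for k in range(1, terms + 1))
    return 1.0 + 2.0 * series


def kolmogorov_quantile(p, tol=1e-10):
    """Invert the cdf by bisection on [0.2, 5]; intended for p not extremely close to 0 or 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.2, 5.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kolmogorov_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With these helpers, a test at level 5% would reject when the standardized statistic exceeds `kolmogorov_quantile(0.95)`, which is close to the familiar value 1.358.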



In order to carry out the test in practice we need an estimate of $D^2$. In principle, any
consistent estimate $\hat D_n^2$ of $D^2$ can be used to obtain the convergence result (10), and the
choice of the estimator certainly depends on the data situation. We follow in our proposal
the kernel-based estimation technique by de Jong and Davidson (2000), who prove asymptotic
results for $L_2$-NED sequences. Let $F_n$, $F_{X,n}$ and $F_{Y,n}$ be the empirical distribution functions
of $((X_i, Y_i))_{i=1,\ldots,n}$, $(X_i)_{i=1,\ldots,n}$ and $(Y_i)_{i=1,\ldots,n}$, respectively, and

    $\hat\psi_{n,i} = 2F_n(X_i, Y_i) - F_{X,n}(X_i) - F_{Y,n}(Y_i) + 1 - \hat\tau_n.$

Then define

    (8)   $\hat D_n^2 = \frac{1}{n} \sum_{i=1}^{n} \hat\psi_{n,i}^2 + 2 \sum_{j=1}^{n-1} \kappa\left(\frac{j}{b_n}\right) \frac{1}{n} \sum_{i=1}^{n-j} \hat\psi_{n,i}\, \hat\psi_{n,i+j},$

where $\kappa : [0, \infty) \to [-1, 1]$ is a kernel function satisfying Assumption 2.8 below and $b_n$ is a
bandwidth parameter depending on $n$. Assumption 2.8 mainly coincides with Assumption 1
of de Jong and Davidson (2000).
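
The kernel estimator of the long run variance can be sketched in a few lines. The sketch assumes the form $\hat D_n^2 = n^{-1} \sum_i \hat\psi_{n,i}^2 + 2 \sum_{j \ge 1} \kappa(j/b_n)\, n^{-1} \sum_i \hat\psi_{n,i} \hat\psi_{n,i+j}$, uses a quartic kernel as an illustrative choice, and evaluates the empirical distribution functions by brute force; all function names are hypothetical.

```python
def quartic_kernel(u):
    """Quartic kernel (1 - u^2)^2 on [0, 1], zero beyond (illustrative choice)."""
    return (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def psi_hat(x, y):
    """Empirical version of psi evaluated at each observation."""
    n = len(x)
    concordant = sum(
        1 for i in range(n) for j in range(i + 1, n)
        if (x[j] - x[i]) * (y[j] - y[i]) > 0
    )
    tau_n = concordant / (n * (n - 1) / 2)
    psi = []
    for i in range(n):
        f_xy = sum(1 for j in range(n) if x[j] <= x[i] and y[j] <= y[i]) / n
        f_x = sum(1 for j in range(n) if x[j] <= x[i]) / n
        f_y = sum(1 for j in range(n) if y[j] <= y[i]) / n
        psi.append(2.0 * f_xy - f_x - f_y + 1.0 - tau_n)
    return psi


def long_run_variance(x, y, bandwidth, kernel=quartic_kernel):
    """Kernel-weighted sum of empirical autocovariances of psi_hat."""
    psi = psi_hat(x, y)
    n = len(psi)
    d2 = sum(p * p for p in psi) / n
    for j in range(1, n):
        weight = kernel(j / bandwidth)
        if weight == 0.0:
            continue
        d2 += 2.0 * weight * sum(psi[i] * psi[i + j] for i in range(n - j)) / n
    return d2
```

For independent observations with independent margins the estimate should be close to 1/36, the variance of $\psi(X_1, Y_1)$ in that case. Note that a kernel taking negative values, or sampling noise, can in principle produce a negative estimate in small samples; the sketch does not guard against this.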



Assumption 2.8. The kernel function $\kappa : [0, \infty) \to [-1, 1]$ with $\kappa(0) = 1$ is continuous at 0
and at all but a finite number of points. Furthermore, $|\kappa|$ is dominated by a non-increasing,
integrable function and

    $\int_{[0,\infty)} \left| \int_{[0,\infty)} \kappa(t) \cos(xt)\, dt \right| dx < \infty.$


Theorem 2.9. Let $((X_i, Y_i))_{i \ge 1}$ be a two-dimensional, stationary process that is P-NED with
approximating constants $(a_k)_{k \ge 1}$ on an absolutely regular process with absolute regularity
coefficients $(\beta_k)_{k \ge 1}$ such that

    (9)   $a_k \Phi(k^{-(12+\varepsilon)}) = O(k^{-(12+\varepsilon)})$   and   $\beta_k = O(k^{-(9+\varepsilon)})$

for some $\varepsilon > 0$. Let furthermore $\kappa$ be a kernel function satisfying Assumption 2.8 and $(b_n)_{n \in \mathbb{N}}$
be a non-decreasing sequence of natural numbers such that $b_n \to \infty$ and $b_n = o(n^{1/2})$. Then

    (10)   $\frac{T_n}{2 \hat D_n} \xrightarrow{d} \sup_{0 \le \lambda \le 1} |B(\lambda)|,$

where $(B(\lambda))_{0 \le \lambda \le 1}$ is, as before, a Brownian bridge.
3. Invariance Principles for P-NED Sequences
The proofs of the results of the previous section are based on two fundamental functional
limit theorems for weakly dependent data. The key ingredient of the proof of Theorem 2.6
is Theorem 3.2 below, an invariance principle for U-statistics. Later in this section we state
in Theorem 3.5 a multivariate empirical process invariance principle, which is essential for
the proof of Theorem 2.9.

Let $g : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ denote a symmetric kernel, and let $(X_i)_{i \ge 1}$ be a d-dimensional,
stationary stochastic process with marginal distribution function $F : \mathbb{R}^d \to [0, 1]$. We define
the U-statistic $U_n$ as

    $U_n = U_n(g) = \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} g(X_i, X_j).$

In order to analyze the asymptotic distribution of $U_n$, we introduce the Hoeffding
decomposition. Define

           $\theta = E\, g(X, X'),$
    (11)   $g_1(x_1) = E(g(x_1, X)) - \theta,$
           $g_2(x_1, x_2) = g(x_1, x_2) - g_1(x_1) - g_1(x_2) - \theta,$

where $X$ and $X'$ are independent random variables that each have the same distribution as
$X_1$. Note that by definition, we get

    $g(x_1, x_2) = \theta + g_1(x_1) + g_1(x_2) + g_2(x_1, x_2).$

The Hoeffding decomposition of $U_n$ is given by

    $U_n = \theta + \frac{2}{n} \sum_{i=1}^{n} g_1(X_i) + \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} g_2(X_i, X_j).$

For the U-statistic invariance principle to hold, we require the kernel $g$ to fulfill the following
regularity condition.

Definition 3.1. The kernel $g$ satisfies the variation condition on $(X_i)_{i \ge 1}$ if there exist
constants $L, \varepsilon_0 > 0$ such that

    $E\left[ \sup_{|(x, x') - (X, X')|_2 \le \varepsilon} |g(x, x') - g(X, X')| \right] \le L \varepsilon$

for all $\varepsilon \in (0, \varepsilon_0)$, where $X$ and $X'$ are independent random variables, identically distributed
as $X_1$.

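Theorem 3.2. Let $(X_i)_{i \ge 1}$ be a d-dimensional, bounded, stationary process that is P-NED
with approximating constants $(a_k)_{k \ge 1}$ on an absolutely regular process with coefficients $(\beta_k)_{k \ge 1}$
satisfying (4). Furthermore, let $g : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ be a bounded kernel satisfying the variation
condition. Then

    (12)   $\left(\sqrt{n}\, \lambda\, (U_{[n\lambda]} - \theta)\right)_{0 \le \lambda \le 1} \xrightarrow{d} \left(2 \sigma W(\lambda)\right)_{0 \le \lambda \le 1},$

where

    (13)   $\sigma^2 = \mathrm{Var}(g_1(X_1)) + 2 \sum_{i=2}^{\infty} \mathrm{Cov}(g_1(X_1), g_1(X_i))$

and $(W(\lambda))_{0 \le \lambda \le 1}$ denotes a standard Brownian motion.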
Remark 3.3.

(i) Weak convergence in (12) is in the space $D([0, 1])$, equipped with the Skorokhod metric.
Alternatively, one may consider a linearly interpolated version, i.e.

    $W_n(\lambda) = \frac{k}{\sqrt{n}} (U_k - \theta)$ for $\lambda = \frac{k}{n}$, linearly interpolated in between.

Then $(W_n(\lambda))_{0 \le \lambda \le 1}$ converges in distribution to $(2 \sigma W(\lambda))_{0 \le \lambda \le 1}$ in the space $C([0, 1])$.

(ii) The formulation of this invariance principle is specifically tailored to the needs of our
application and does not strive for utmost generality. In particular the boundedness
of $g$ is a rather strict requirement that can be relaxed to a uniform bound on the
$(2 + \eta)$-moments of $g(X_1, X_k)$ for some $\eta > 0$. Contrary to the independent case, the
existence of exactly second moments is generally not enough for series of dependent
random variables. However, the relaxation of the boundedness must be paid for by a
faster decay of the $\beta_k$ and $a_k$ coefficients. For details see Wendler (2011, Chap. 3).

(iii) Denker and Keller (1986) proved a central limit theorem for U-statistics of approximating
functionals of absolutely regular processes. A functional central limit theorem for
U-statistics was established for absolutely regular processes by Yoshihara (1976) and
by Denker and Keller (1983) under different sets of conditions. Theorem 3.2 extends
these results to the much larger class of functionals of absolutely regular processes.



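Corollary 3.4. Under the assumptions of Theorem 3.2,

    (14)   $\left(\sqrt{n}\, \lambda\, (U_{[n\lambda]} - U_n)\right)_{0 \le \lambda \le 1} \xrightarrow{d} \left(2 \sigma B(\lambda)\right)_{0 \le \lambda \le 1},$

where $(B(\lambda))_{0 \le \lambda \le 1}$ is a Brownian bridge and $\sigma^2$ is defined in (13).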



For the proof of Theorem 2.9 we require in particular that the distribution function $F$ of
$(X_i, Y_i)$ is sufficiently well approximated by the empirical distribution function $F_n$ (stemming
from weakly dependent observations). This is ensured by the following functional limit
theorem for the empirical process.

Theorem 3.5. Let $((X_i, Y_i))_{i \ge 1}$ be a two-dimensional, stationary process that is P-NED with
approximating constants $(a_k)_{k \ge 1}$ on an absolutely regular process with absolute regularity
coefficients $(\beta_k)_{k \ge 1}$ satisfying (9). Then the empirical process

    $\left(\sqrt{n}\, (F_n(s, t) - F(s, t))\right)_{s, t \in \mathbb{R}}$

converges weakly to a centered Gaussian process $(W(s, t))_{s, t \in \mathbb{R}}$ with covariance function

    $E(W(s, t) W(s', t')) = \sum_{k=-\infty}^{\infty} \mathrm{Cov}\left(1_{\{X_0 \le s,\, Y_0 \le t\}},\, 1_{\{X_k \le s',\, Y_k \le t'\}}\right).$



4. Change-point Identification 

If the test rejects the null hypothesis of constant correlation, and if it is furthermore
reasonable to assume that there is one sudden change-point, it is, of course, of interest
to locate this change-point. An intuitive estimator, which is common when dealing with
CUSUM-type change-point tests, is the position at which the weighted correlation differences
take their maximum, that is

    $\hat k_n = \arg\max_{1 \le k \le n} \frac{k}{\sqrt{n}} |\hat\tau_k - \hat\tau_n|.$



We will show in the following that this is indeed a reasonable estimator. We will assume 
that the following model holds. 
Model 4.1 (Change-point model). Let $0 < b < 1$. For $n \in \mathbb{N}$ let $((X_{i,n}, Y_{i,n}))_{1 \le i \le [bn]}$
and $((X_{i,n}, Y_{i,n}))_{[bn]+1 \le i \le n}$ be two bivariate, stationary stochastic processes with marginal
distribution functions $F$ and $G$, respectively. Let furthermore $((X_{i,n}, Y_{i,n}))_{1 \le i \le n}$ be P-near
epoch dependent¹ on an absolutely regular process with coefficients satisfying (9) uniformly
for all $n$.

The goal is to estimate $b$. Let $\tau_F$ and $\tau_G$ denote Kendall's tau of $F$ and $G$, respectively.
Moreover, let $\tau_{FG} = E\, h((X_1, Y_1), (X_2, Y_2))$, where $(X_1, Y_1) \sim F$ and $(X_2, Y_2) \sim G$ are
independent.
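Assumption 4.2. The values $b, \tau_F, \tau_G, \tau_{FG} \in [0, 1]$ are such that $|c(\lambda)|$, $\lambda \in [0, 1]$, has a
unique maximum at $\lambda = b$, where the function $c : [0, 1] \to \mathbb{R}$ is given by

    (15)   $c(\lambda) = \left[(1 - b^2)\tau_F - (1 - b)^2 \tau_G - 2b(1 - b)\tau_{FG}\right] \lambda$   for $0 \le \lambda \le b$,
           $c(\lambda) = 2b(\tau_{FG} - \tau_G)(1 - \lambda) + b^2 (\tau_F + \tau_G - 2\tau_{FG}) \left(\frac{1}{\lambda} - \lambda\right)$   for $b < \lambda \le 1$.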
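
Whether a given parameter constellation satisfies Assumption 4.2 can be checked numerically. The sketch below evaluates the piecewise function $c$ in the form $c(\lambda) = [(1-b^2)\tau_F - (1-b)^2\tau_G - 2b(1-b)\tau_{FG}]\lambda$ for $\lambda \le b$ and $c(\lambda) = 2b(\tau_{FG}-\tau_G)(1-\lambda) + b^2(\tau_F+\tau_G-2\tau_{FG})(1/\lambda-\lambda)$ for $\lambda > b$, and locates the maximum of $|c|$ on a grid. The function names and the grid resolution are hypothetical choices.

```python
def limit_curve(lam, b, tau_f, tau_g, tau_fg):
    """Piecewise function c(lambda) of the change-point model."""
    if lam <= b:
        slope = (1 - b ** 2) * tau_f - (1 - b) ** 2 * tau_g - 2 * b * (1 - b) * tau_fg
        return slope * lam
    return (2 * b * (tau_fg - tau_g) * (1 - lam)
            + b ** 2 * (tau_f + tau_g - 2 * tau_fg) * (1 / lam - lam))


def argmax_abs_curve(b, tau_f, tau_g, tau_fg, grid_size=1000):
    """Grid point in [0, 1] at which |c| is largest (first one in case of ties)."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(grid, key=lambda lam: abs(limit_curve(lam, b, tau_f, tau_g, tau_fg)))
```

For example, with $b = 0.4$, $\tau_F = 0.5$, $\tau_G = 0.7$ and $\tau_{FG} = 0.6$ the absolute value of the curve increases linearly up to $\lambda = 0.4$ and decreases afterwards, so the grid maximizer coincides with $b$.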



Theorem 4.3. If $((X_{i,n}, Y_{i,n}))_{1 \le i \le n,\, n \in \mathbb{N}}$ follows Model 4.1 and Assumption 4.2 holds, then

    $\frac{\hat k_n}{n} \xrightarrow{p} b$

as $n \to \infty$.



Remark 4.4. Assumption 4.2 is satisfied in "most" situations. It is true if $\tau_F \ne \tau_G$ and

    $\frac{(1 - b)^2}{2((1 - b)^2 + b)} \le \frac{\tau_{FG} - \tau_F}{\tau_G - \tau_F}.$

It is violated if

    $0 \le \frac{\tau_{FG} - \tau_F}{\tau_G - \tau_F} < \frac{(1 - b)^2}{2((1 - b)^2 + b)},$

i.e., if $\tau_{FG}$ is too close to $\tau_F$. It is an open research question which values of $\tau_{FG}$ are possible
for given $\tau_F$ and $\tau_G$, and in particular if $\tau_{FG}$ may at all lie outside the interval $[\tau_F, \tau_G]$.

5. Properties of the Test and Comparison to Previous Proposals 



Wied, Krämer, and Dehling (2012) propose a test for constant correlation based on Pearson's
linear correlation coefficient

    $\hat\varrho_k = \frac{\sum_{i=1}^{k} (X_i - \bar X_k)(Y_i - \bar Y_k)}{\left(\sum_{i=1}^{k} (X_i - \bar X_k)^2\right)^{1/2} \left(\sum_{i=1}^{k} (Y_i - \bar Y_k)^2\right)^{1/2}}.$

The test statistic

    $T_{\varrho,n} = \max_{1 \le k \le n} \frac{k}{\sqrt{n}} |\hat\varrho_k - \hat\varrho_n|$

is shown to converge in distribution to $D_\varrho \sup_{0 \le t \le 1} |B(t)|$ as $n \to \infty$ for NED sequences
under the null hypothesis of no change in correlation, where $B(t)$ is a Brownian bridge and
$D_\varrho$ an appropriate scaling factor. This test, devised for normal data, shall serve as the
main benchmark method for our test. Our motivation for using Kendall's tau is the wish to
efficiently detect structural changes in arbitrarily heavy-tailed and potentially contaminated
data. Both tests are constructed in a similar way. The differences between the two tests are
largely due to the different properties of the estimators $\hat\varrho_k$ and $\hat\tau_k$. It is therefore worthwhile
to have a closer look at these two correlation measures.

¹ For non-stationary processes the short-range dependence conditions have to be formulated slightly
more generally than in Definition 2.2. The absolute regularity coefficients $(\beta_k)_{k \in \mathbb{N}}$ are defined as
$\beta_k = \sup_{t \in \mathbb{Z}} \beta(\mathcal{F}_{-\infty}^{t}, \mathcal{F}_{t+k}^{\infty})$, and the P-NED approximation coefficients $(a_k)_{k \in \mathbb{N}}$ must satisfy

    $\sup_{t \in \mathbb{Z}} P(|X_t - f_{k,t}(Z_{t-k}, \ldots, Z_{t+k})|_1 > \varepsilon) \le a_k \Phi_t(\varepsilon),$

where the functions $f_{k,t}$ and $\Phi_t$ may also depend on $t$. The underlying process $(Z_t)_{t \in \mathbb{Z}}$ is not required to be
stationary.

The copula of the two-dimensional distribution function $F$ is

    $C : [0, 1]^2 \to [0, 1] : (x, y) \mapsto F(F_X^{-1}(x), F_Y^{-1}(y)),$

thus $F(x, y) = C(F_X(x), F_Y(y))$. Kendall's tau is a function of the copula,

    (16)   $\tau = 2 \int_{[0,1]^2} C(u, v)\, dC(u, v) = 2\, E\, C(U, V) = 2\, E\, F(X, Y),$

where $U, V$ are two random variables, each uniformly distributed on $[0, 1]$, with joint cdf $C$, e.g.,
$U = F_X(X)$ and $V = F_Y(Y)$, cf. Nelsen (2006, Chap. 5). The distribution function of
$C(U, V)$ is also called the Kendall distribution function of the copula $C$. Kendall's tau
consequently is invariant with respect to monotonic, componentwise transformations of $F$.
Furthermore, $\hat\tau_n$ is asymptotically normal at the $\sqrt{n}$ rate with asymptotic variance

    (17)   $ASV(\hat\tau_n) = 4\, \mathrm{Var}(\psi(X, Y)) = 4\, \mathrm{Var}(2C(U, V) - U - V)$

for i.i.d. observations of any continuous distribution $F$, where $\psi : \mathbb{R}^2 \to \mathbb{R}$ is defined in
Theorem 2.6. Note that, instead of (16) and (17), the corresponding values for $\tilde\tau = 2\tau - 1$
are usually given in the literature. Thus, no matter how heavy the tails of the distribution $F$
are, as long as the marginals are joined by a Gauss copula, the asymptotic variance of $\hat\tau_n$,
and hence the asymptotic distribution of $T_n$, are the same as in the normal model.

A very popular class of multivariate distributions, which offers a convenient way of modeling
data with arbitrarily heavy tails, is the elliptical model. A two-dimensional, continuous,
centered, elliptical distribution $F$ has a density $f$ of the form

    (18)   $f(\mathbf{x}) = \det(S)^{-1/2}\, \gamma(\mathbf{x}^T S^{-1} \mathbf{x}), \quad \mathbf{x} = (x, y)^T \in \mathbb{R}^2,$

where

    $S = \begin{pmatrix} s_{1,1} & s_{1,2} \\ s_{1,2} & s_{2,2} \end{pmatrix}$

is a symmetric, positive definite matrix and $\gamma : [0, \infty) \to [0, \infty)$ a univariate function. We
use the notation $\mathcal{E}_2(0, S)$ for this class of distributions. If the second moments of $F$ are
finite, then

    (19)   $\varrho = \frac{s_{1,2}}{\sqrt{s_{1,1}\, s_{2,2}}}$

is Pearson's linear correlation coefficient. Otherwise we use (19) as definition for the
generalized linear correlation coefficient of the elliptical distribution $F$. In the elliptical model
there is a one-to-one correspondence between $\varrho$ and $\tau$:

    $\tau = \frac{1}{\pi} \arcsin(\varrho) + \frac{1}{2}, \quad -1 \le \varrho \le 1.$

Thus by letting

    $\hat\varrho_{\tau,n} = \sin(\pi(\hat\tau_n - 1/2)),$

the estimators $\hat\varrho_n$ and $\hat\varrho_{\tau,n}$ are both Fisher-consistent for the same quantity $\varrho$, cf. (19), in the
elliptical model. Comparing their asymptotic variances allows a prognosis concerning the
efficiency relation of the corresponding change-point tests.
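
The correspondence between $\varrho$ and $\tau$ in the elliptical model amounts to two one-line transformations, sketched here with hypothetical function names. Recall that $\tau$ denotes the probability of concordance, so it ranges over $[0, 1]$.

```python
import math


def tau_from_rho(rho):
    """Kendall's tau (probability of concordance) implied by rho in the elliptical model."""
    return math.asin(rho) / math.pi + 0.5


def rho_from_tau(tau):
    """Inverse map: transformed Kendall estimate of the generalized correlation."""
    return math.sin(math.pi * (tau - 0.5))
```

Applying `rho_from_tau` to the sample version $\hat\tau_n$ gives the transformed estimator $\hat\varrho_{\tau,n}$.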
If fourth moments of $F \in \mathcal{E}_2(0, S)$ are finite, then the asymptotic variance of $\hat\varrho_n$, computed
from independent realizations of $F$, is

    (20)   $ASV(\hat\varrho_n) = (1 + \kappa/3)(1 - \varrho^2)^2,$

where $\kappa = E(X^4) / (E(X^2))^2 - 3$ is the excess kurtosis of any component $X$ of $\mathbf{X} \sim F$.
At the two-dimensional normal distribution, i.e. for $\gamma(x) = (2\pi)^{-1} e^{-x/2}$, the
kurtosis $\kappa$ is equal to zero. For the two-dimensional t-distribution with $\nu$ degrees of freedom
(denoted by $t_{2,\nu}$ in the following), we have $\gamma(x) = c_\nu (1 + x/\nu)^{-\nu/2 - 1}$ for some normalizing
constant $c_\nu$, and $\kappa = 6/(\nu - 4)$ for $\nu \ge 5$. The asymptotic variance of $\hat\varrho_{\tau,n}$ at the normal
model is

    (21)   $ASV(\hat\varrho_{\tau,n}) = \pi^2 (1 - \varrho^2) \left(\frac{1}{9} - \frac{4}{\pi^2} \arcsin^2\left(\frac{\varrho}{2}\right)\right),$

see Croux and Dehon (2010). For example, for two independent random variables we have
$ASV(\hat\varrho_n) = 1$ and $ASV(\hat\varrho_{\tau,n}) = \pi^2/9 = 1.097$. For uncorrelated, jointly $t_{2,\nu}$ distributed
random variables, Dengler (2010) gives the following expression for the asymptotic variance
of $\hat\varrho_{\tau,n}$,



ASV{Qr,n) 



32r(f) 



u'' ""^arctan^fw 



Jo 



{i-ty-^{u' + t)-''dtdu, 



vr2r3(|) J, 

and derives explicit expressions. The asymptotic variance $ASV(\hat\varrho_{\tau,n})$ is a decreasing function
of $\nu$; it equals 1.922 and 1.296 for $\nu = 1$ and $\nu = 5$, respectively, and is smaller than $ASV(\hat\varrho_n)$
for $\nu < 16$.

The maximum likelihood estimator $\hat\varrho_{t(\nu),n}$ of $\varrho$ at the $t_{2,\nu}$ distribution has asymptotic
variance

    $ASV(\hat\varrho_{t(\nu),n}) = \frac{\nu + 4}{\nu + 2} (1 - \varrho^2)^2, \quad \nu \ge 1,$

which, for $\varrho = 0$, is equal to 1.667 for $\nu = 1$ and 1.286 for $\nu = 5$, see Bilodeau and Brenner
(1999, p. 221). We note the remarkable fact that for all two-dimensional, uncorrelated t-
and normal distributions the asymptotic relative efficiency of Kendall's tau with respect to
the respective MLE is above 90% for $\nu \ge 2$. It is more than 99% at an uncorrelated $t_{2,5}$
distribution.
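
The numerical values quoted in this comparison can be reproduced from the closed-form expressions. The sketch assumes the formulas $ASV(\hat\varrho_n) = (1+\kappa/3)(1-\varrho^2)^2$, $ASV(\hat\varrho_{\tau,n}) = \pi^2(1-\varrho^2)(1/9 - (4/\pi^2)\arcsin^2(\varrho/2))$ at the normal model, and $ASV(\hat\varrho_{t(\nu),n}) = \frac{\nu+4}{\nu+2}(1-\varrho^2)^2$; the function names are hypothetical. The value 1.296 for Kendall's tau at the $t_{2,5}$ distribution is taken from the text, not computed.

```python
import math


def asv_pearson(rho, excess_kurtosis=0.0):
    """Asymptotic variance of Pearson's correlation in the elliptical model."""
    return (1.0 + excess_kurtosis / 3.0) * (1.0 - rho ** 2) ** 2


def excess_kurtosis_t(nu):
    """Excess kurtosis of a t-distributed component; requires nu > 4."""
    if nu <= 4:
        raise ValueError("fourth moments exist only for nu > 4")
    return 6.0 / (nu - 4.0)


def asv_kendall_normal(rho):
    """Asymptotic variance of the transformed Kendall estimator at the normal model."""
    return math.pi ** 2 * (1.0 - rho ** 2) * (
        1.0 / 9.0 - 4.0 / math.pi ** 2 * math.asin(rho / 2.0) ** 2
    )


def asv_mle_t(rho, nu):
    """Asymptotic variance of the correlation MLE at the bivariate t distribution."""
    return (nu + 4.0) / (nu + 2.0) * (1.0 - rho ** 2) ** 2


ASV_KENDALL_T5 = 1.296  # value quoted in the text for nu = 5, rho = 0
efficiency_t5 = asv_mle_t(0.0, 5) / ASV_KENDALL_T5
```

The ratio `efficiency_t5` is the asymptotic relative efficiency of the transformed Kendall estimator with respect to the MLE at the uncorrelated $t_{2,5}$ distribution.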

The other popular nonparametric correlation measure, Spearman's rho, which is often
considered along with Kendall's tau, is defined as Pearson's linear correlation of the ranks
of the data. It can be written as

    $\hat s_n = \frac{12}{(n - 1)\, n\, (n + 1)} \sum_{i=1}^{n} R_n(X_i) R_n(Y_i) - 3\, \frac{n + 1}{n - 1},$

where $R_n(X_i)$ denotes the rank of the ith observation $X_i$ among $X_1, \ldots, X_n$, likewise $R_n(Y_i)$.
The population version of Spearman's rho,

    (22)   $s = 12 \int_{[0,1]^2} uv\, dC(u, v) - 3 = 12\, E(UV) - 3 = 12\, E(F_X(X) F_Y(Y)) - 3,$

is also a function of the copula. Generally, Kendall's tau and Spearman's rho have similar
statistical properties. See, e.g., Nelsen (2006, Chap. 5) for details on their relationship. Croux
and Dehon (2010) compare both with respect to robustness and efficiency at the normal
model and conclude that in both respects their performance is comparable,
but Kendall's tau is slightly preferable. Wied, Dehling, van Kampen, and Vogel (2011)
propose a non-parametric, robust change-point test for constant correlation for strongly
mixing sequences that is closely related to Spearman's rho. They consider the test statistic

    $T_{s,n} = \max_{1 \le k \le n} \frac{k}{\sqrt{n}} |\hat s_k - \hat s_n|,$

where



12 A „ __„ ,_. 12 



(23) 4 = ^ V Rn{X,)Rn{Y^ - 3 - -, l<k<n. 

i=l 

The proof of its convergence is based on an invariance principle for the multivariate
sequential empirical process by Rüschendorf (1976). Despite the mentioned practical parity of
Kendall's tau and Spearman's rho, this test has a low efficiency compared to our Kendall's
tau based test (see Section 6). The reason lies in the usage of $R_n(\cdot)$ instead of $R_k(\cdot)$ in (23).

Spearman's rho is asymptotically equivalent to a U-statistic of order 3, see Moran (1948),
and an asymptotic analysis of the related test statistic

rri \ ^ 

-tr,n — max 1=1^ k ^n\ 
l<k<n Wn 



by means of U-statistics theory is mathematically much more involved. Since Spearman's
rho, on the other hand, exhibits no pronounced advantage over Kendall's tau, we do not
pursue this further here. Finally, we note that both estimators require a comparable
computing effort. Both can be computed in $O(n \log n)$ time. Simple algorithms to compute the
test statistics require O(ra^). 

6. Simulation Results 

In this section we give some numerical results, primarily addressing two questions: we want
to (A) examine the quality of the asymptotic approximation of the distribution of the test
statistic for finite $n$ under different dependence scenarios and (B) compare the performance of
the test to the proposals by Wied et al. (2012, 2011) with respect to efficiency and robustness.
Furthermore, (C) we demonstrate the applicability of our test on an example which is neither
absolutely regular nor an $L_p$ approximating functional, $p \ge 1$, of an absolutely regular
process, but which is easily shown to be P-NED on an i.i.d. process.

For objectives (A) and (B) we consider two simple models that substantially differ with
respect to the strength of the serial dependence. Both fit into the framework of the following
general model.

General model. The random vectors $(\delta_i, \epsilon_i)$, $i \in \mathbb{Z}$, are independent and identically
distributed, each having a bivariate, centered elliptical distribution $\mathcal{E}_2(0, S)$ with

    $S = \begin{pmatrix} 1 & \varrho \\ \varrho & 1 \end{pmatrix},$

where the shape parameter $|\varrho| < 1$ is equal to the usual moment correlation if the second
moments of $(\delta_i, \epsilon_i)$ are finite. The series $((X_i, Y_i))_{i \in \mathbb{Z}}$ then follows the AR(1) process

    $X_i = \varphi X_{i-1} + \delta_i, \quad Y_i = \varphi Y_{i-1} + \epsilon_i,$

with AR parameter $|\varphi| < 1$.
Model 6.1. $\varphi = 0$.
Model 6.2. $\varphi = 0.8$.
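
Data from the general model can be generated along the following lines. The sketch assumes the componentwise recursion $X_i = \varphi X_{i-1} + \delta_i$, $Y_i = \varphi Y_{i-1} + \epsilon_i$ with a unit-diagonal shape matrix, and supports bivariate normal innovations or, if `df` is given, bivariate t innovations. The function name, the burn-in length and the zero start value are hypothetical choices of this sketch, not part of the simulation design described in the text.

```python
import math
import random


def simulate_ar1_pair(n, phi, rho, df=None, burn_in=200, rng=random):
    """Simulate n observations (X_i, Y_i) of the bivariate AR(1) model.

    Innovations are bivariate normal with shape parameter rho, or bivariate t
    with `df` degrees of freedom when df is not None.
    """
    if not (abs(phi) < 1 and abs(rho) < 1):
        raise ValueError("phi and rho must lie strictly between -1 and 1")
    x = y = 0.0
    xs, ys = [], []
    for step in range(burn_in + n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        delta = z1
        eps = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        if df is not None:
            # Bivariate t: divide both components by the same sqrt(chi2_df / df).
            scale = math.sqrt(rng.gammavariate(df / 2.0, 2.0) / df)
            delta /= scale
            eps /= scale
        x = phi * x + delta
        y = phi * y + eps
        if step >= burn_in:
            xs.append(x)
            ys.append(y)
    return xs, ys
```

Calling `simulate_ar1_pair(500, 0.0, 0.4)` would correspond to Model 6.1 and `simulate_ar1_pair(500, 0.8, 0.4)` to Model 6.2, each with normal innovations and $\varrho = 0.4$.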
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The independent observations of Model 6.1 constitute the "good case" scenario (the best 



reahstically assumable case), while Model 6.2 implements a dependence scenario with strong 



positive autocorrelations, where we expect the test to be less efficient. 
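
To make the general model concrete, here is a small simulation sketch with NumPy. It assumes a shape matrix with unit diagonal and off-diagonal $\rho$, and obtains bivariate t innovations by dividing a normal vector by an independent scaled chi-square variable. The function name, the `burn_in` length and the zero starting value are illustrative choices, not taken from the text.

```python
import numpy as np


def simulate_general_model(n, rho, phi, df=None, burn_in=200, seed=None):
    """Simulate n observations (X_i, Y_i) of the AR(1) model.

    df=None gives bivariate normal innovations; an integer df gives a
    bivariate t distribution with df degrees of freedom. burn_in extra
    steps are discarded so the start value 0 has little influence
    (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    total = n + burn_in
    shape = np.array([[1.0, rho], [rho, 1.0]])
    innov = rng.multivariate_normal(np.zeros(2), shape, size=total)
    if df is not None:
        # One common mixing variable per time point keeps the vector elliptical.
        w = rng.chisquare(df, size=total) / df
        innov = innov / np.sqrt(w)[:, None]

    xy = np.empty_like(innov)
    state = np.zeros(2)
    for i in range(total):
        state = phi * state + innov[i]   # same AR parameter in both coordinates
        xy[i] = state
    return xy[burn_in:]
```

With `phi=0.0` this yields the independent observations of Model 6.1, and with `phi=0.8` the strongly autocorrelated Model 6.2.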

Throughout we estimate the cumulated covariance $D^2$ by the estimator $\hat D_n^2$ given by (8),
where we choose $\kappa$ to be the quartic kernel

$$\kappa(x) = (1 - x^2)^2 \, \mathbf{1}_{[-1,1]}(x)$$

and the bandwidth $b_n = \lfloor 2 n^{1/3} \rfloor$.
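
The following sketch shows how such a kernel estimate might be computed. It assumes the estimator has the form used in Appendix A, namely $\frac{1}{n}\sum_i \hat\psi_{n,i}^2 + \frac{2}{n}\sum_{j} \kappa(j/b_n) \sum_{i} \hat\psi_{n,i}\hat\psi_{n,i+j}$ with $\hat\psi_{n,i} = 2F_n(X_i,Y_i) - F_{X,n}(X_i) - F_{Y,n}(Y_i) + 1 - \hat\tau_n$. The bandwidth exponent $1/3$ is an assumed reading of a damaged formula, and all function names are illustrative.

```python
import math
import numpy as np


def quartic_kernel(x):
    """Quartic kernel (1 - x^2)^2 on [-1, 1], zero elsewhere."""
    return (1.0 - x * x) ** 2 if abs(x) <= 1.0 else 0.0


def bandwidth(n):
    """floor(2 n^(1/3)); the exponent 1/3 is an assumption of this sketch.

    The small epsilon guards against floating-point cube roots such as
    1000 ** (1/3) evaluating to 9.999..., which would floor to 19.
    """
    return math.floor(2.0 * n ** (1.0 / 3.0) + 1e-9)


def psi_hat(x, y):
    """Empirical psi values, using O(n^2) memory via broadcasting."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    le_x = x[None, :] <= x[:, None]      # le_x[i, j] = 1{x_j <= x_i}
    le_y = y[None, :] <= y[:, None]
    f_x = le_x.mean(axis=1)              # empirical marginal cdf at x_i
    f_y = le_y.mean(axis=1)
    f_xy = (le_x & le_y).mean(axis=1)    # empirical joint cdf at (x_i, y_i)
    dx = x[:, None] - x[None, :]
    dy = y[:, None] - y[None, :]
    # Ordered concordant pairs over n(n-1) ordered pairs: fraction concordant.
    tau = ((dx * dy) > 0).sum() / (n * (n - 1))
    return 2.0 * f_xy - f_x - f_y + 1.0 - tau


def long_run_variance(x, y):
    """Kernel estimate of D^2 with quartic kernel and bandwidth b_n."""
    psi = psi_hat(x, y)
    n = len(psi)
    b = bandwidth(n)
    d2 = float(np.mean(psi ** 2))
    for j in range(1, min(b, n - 1) + 1):
        d2 += 2.0 * quartic_kernel(j / b) * float(np.dot(psi[:-j], psi[j:])) / n
    return d2
```

For independent margins the estimate should be close to $1/36$ in large samples. The quadratic memory use is acceptable for $n = 500$, but a rank-based implementation would be preferable for long series.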

Objective (A). If we let the distribution of $(\delta_i, \varepsilon_i)$ be bivariate normal with $\rho = 0$,
the margins of $(X_i, Y_i)$, $i \in \mathbb{Z}$, are independent, and, in both Models 6.1 and 6.2, $D^2$
can be given an explicit form. We have

$$D^2 = \frac{1}{36} = 0.02\overline{7}$$

for Model 6.1 and

$$D^2 = \frac{1}{36} + \frac{2}{\pi^2} \sum_{k=1}^{\infty} \arcsin^2\!\left(\frac{0.8^k}{2}\right) = 0.12099$$

for Model 6.2. The values can be deduced from Lemma D.1 in the appendix. Recall that
$\arcsin(x)$ is very close to $x$ for $x$ close to zero. This allows us to compare how well
$T_n/(2\hat D_n)$ and $T_n/(2D)$ are approximated by their common limit distribution. In
Figures 1 and 2, the cdf $F_K$ of the limiting Kolmogorov distribution is plotted along with
several empirical distribution functions, each based on 5000 repetitions. The left panels
show empirical distribution functions of $T_n/(2D)$ for different $n$ and the right panels
empirical cdfs of $T_n/(2\hat D_n)$. In Figure 1 the results for Model 6.1 with $\rho = 0$ are
displayed. While for $n = 100$, $T_n/(2D)$ is already very close to its limit distribution,
there is some considerable bias for $T_n/(2\hat D_n)$, which vanishes for $n \ge 500$. The results
for Model 6.2 (also with $\rho = 0$) in Figure 2 look different. The convergence of $T_n/(2D)$ is
much slower. For $n = 1000$ there is still a considerable gap between the empirical cdf and
its limit. Somewhat surprising but good news is that $T_n/(2\hat D_n)$ converges faster in this
situation. Here the asymptotics are usable for $n \ge 500$. For both models we ran simulations
for $n = 10, 20, 50, 100, 500$ and $1000$. For clarity of visual representation, some of the
curves are omitted. Where the curves for $n = 500$ and $n = 1000$ are not displayed, they
practically agree with the graph of $F_K$.
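
The quantities used for Objective (A) can be checked numerically. The sketch below assumes that $D^2$ for independent Gaussian AR(1) margins equals $\frac{1}{36} + \frac{2}{\pi^2}\sum_{k \ge 1} \arcsin^2(\varphi^k/2)$ and uses the standard series representation of the Kolmogorov cdf. The function names and truncation lengths are illustrative choices.

```python
import math


def d_squared_gaussian_ar1(phi, terms=500):
    """D^2 for independent Gaussian AR(1) margins (rho = 0), assumed formula."""
    total = sum(math.asin(phi ** k / 2.0) ** 2 for k in range(1, terms + 1))
    return 1.0 / 36.0 + 2.0 / math.pi ** 2 * total


def kolmogorov_cdf(x, terms=1000):
    """F_K(x) = 1 - 2 * sum_{k>=1} (-1)^(k+1) exp(-2 k^2 x^2) for x > 0.

    The series is truncated after `terms` summands and the result is clipped
    to [0, 1]; it converges slowly only for x very close to zero.
    """
    if x <= 0.0:
        return 0.0
    s = sum((-1) ** (k + 1) * math.exp(-2.0 * k * k * x * x)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 1.0 - 2.0 * s))


def asymptotic_p_value(t_n, d):
    """Asymptotic p-value of the test, given T_n and D (or an estimate of D)."""
    return 1.0 - kolmogorov_cdf(t_n / (2.0 * d))
```

Under this reading, `d_squared_gaussian_ar1(0.8)` reproduces the value 0.12099 quoted for Model 6.2, and the asymptotic 5% critical value for $T_n/(2D)$ is approximately 1.358.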
Objective (B). Guided by the observations above we use a fixed sample size of $n = 500$
for the efficiency comparison of our test to the previous proposals by Wied et al. (2012) and
Wied et al. (2011). These tests are based on the statistics discussed in Section 5 and are
referred to as Pearson test and copula test, respectively, in the following. The variance
estimation for these test statistics is done according to the authors' proposals, which both
also implement kernel estimators following de Jong and Davidson (2000). We also take the
quartic kernel and the bandwidth $b_n = \lfloor 2 n^{1/3} \rfloor$.

For the first half of the data, we sample independent realizations $(\delta_i, \varepsilon_i)$,
$i = 1, \ldots, 250$, with correlation parameter $\rho_1 = 0.4$. For the second half of the data,
we use the correlation parameter $\rho_2$, for which we allow the values 0.4 (null hypothesis),
0.6, 0.8, 0.2, 0, -0.2, -0.4. Thus in Model 6.1, where $(X_i, Y_i) = (\delta_i, \varepsilon_i)$, we have a
constant correlation of 0.4 at the beginning and then a sudden jump. In Model 6.2, there is
an abrupt jump in the correlation of the innovations $(\delta_i, \varepsilon_i)$ at time $n/2$, which means a
gradual but quick change in the correlation of the observed process $(X_i, Y_i)$. Note that the
stationary processes $((\delta_i, \varepsilon_i))_{i \in \mathbb{Z}}$ and $((X_i, Y_i))_{i \in \mathbb{Z}}$ have the same correlation.

Figure 1. Convergence of $T_n/(2D)$ (left) and $T_n/(2\hat D_n)$ (right) to the Kolmogorov
distribution for serial independence; empirical cdfs based on 5000 repetitions.

Figure 2. Convergence of $T_n/(2D)$ (left) and $T_n/(2\hat D_n)$ (right) to the Kolmogorov
distribution for independent Gaussian AR(1) processes with AR parameter $\varphi = 0.8$;
empirical cdfs based on 5000 repetitions.



We consider five different elliptical distributions for $(\delta_i, \varepsilon_i)$: the bivariate normal
distribution and bivariate t-distributions with 20, 5, 3 and 1 degrees of freedom. The $t_{20}$
distribution has slightly heavier tails than the normal, whereas $t_5$, $t_3$ and $t_1$ serve as
examples of very heavy-tailed distributions. The $t_\nu$ distribution possesses finite moments
of order $\nu - 1$, but no higher whole number. Thus $t_5$ is the "smallest t-distribution" for
which the Pearson test by Wied et al. (2012) works, and $t_3$ is the "smallest t-distribution"
for which Pearson's moment correlation is defined.

For each combination of model, jump height and marginal distribution we generate 1000
samples and compute the three test statistics from each sample. The observed rejection
frequencies at the significance level .05 are given in Tables 1 and 2 for Models 6.1 and 6.2,
respectively. Regarding Table 1 we note the following.

Table 1. Efficiency comparison of three correlation change-point tests under Model 6.1.
Different marginal distributions, 500 observations, different jump sizes in the middle of
the sample. Empirical rejection frequencies at the asymptotic .05 level based on 1000
repetitions.

                        Jump at n/2:
Distribution  Test      none   -.2   +.2   -.4   +.4   -.6   -.8
normal        Pearson    .04   .46   .70   .97  1.00  1.00  1.00
              copula     .04   .06   .07   .22   .20   .47   .78
              Kendall    .05   .44   .65   .96  1.00  1.00  1.00
t_20          Pearson    .04   .42   .65   .97  1.00  1.00  1.00
              copula     .03   .07   .08   .22   .20   .47   .77
              Kendall    .04   .46   .63   .97  1.00  1.00  1.00
t_5           Pearson    .04   .24   .41   .73   .95   .95   .98
              copula     .04   .08   .08   .22   .20   .46   .76
              Kendall    .04   .41   .55   .95  1.00  1.00  1.00
t_3           Pearson    .06   .14   .25   .39   .69   .64   .79
              copula     .03   .08   .08   .21   .18   .43   .72
              Kendall    .03   .39   .52   .91  1.00  1.00  1.00
t_1           Pearson    .47   .48   .50   .49   .56   .52   .51
              copula     .03   .06   .07   .17   .17   .38   .63
              Kendall    .04   .29   .38   .83   .98  1.00  1.00

(1) The Pearson test is slightly better than the Kendall test at the normal distribution. Both
tests lose power with increasing tails, but the loss is much smaller for the Kendall test.
For the $t_{20}$ distribution, the results are comparable. The Kendall test is clearly better
for heavier tails. These observations are fully in line with our expectations considering
the efficiency comparison of the respective correlation measures in Section 5.

(2) Throughout, the copula test has a very low power. 

(3) A positive jump (from the positive correlation $\rho = 0.4$) is better detected by the Kendall
and the Pearson test than a negative jump of the same height. This is also to be
expected considering that correlation measures generally have a smaller variance if the
true absolute correlation is large, cp. also (20) and (21).

(4) The Pearson test has no mathematical justification if fourth moments do not exist. For
the $t_3$ distribution it gives nevertheless approximate results, whereas for the $t_1$
distribution it is completely useless.

Analyzing Table 2 we find that

(5) the power of all tests is lower for the AR(1) process than in the independent case, and

(6) the observations made in Table 1 concerning the comparison of the tests generally also
apply here.

There are some differences, though. 

Table 2. Efficiency comparison of three correlation change-point tests.
$((X_i, Y_i))_{i=1,\ldots,n}$ AR(1) process with AR parameter $\varphi = 0.8$. Different innovation
distributions, 500 observations, several alternatives. Empirical rejection frequencies at
the asymptotic .05 level based on 1000 repetitions.

                        Jump at n/2:
Distribution  Test      none   -.2   +.2   -.4   +.4   -.6   -.8
normal        Pearson    .07   .13   .27   .45   .77   .78   .94
              copula     .05   .06   .04   .07   .07   .11   .18
              Kendall    .05   .14   .19   .46   .73   .79   .96
t_20          Pearson    .06   .11   .31   .41   .77   .77   .94
              copula     .05   .06   .06   .08   .08   .12   .20
              Kendall    .04   .13   .23   .42   .74   .79   .95
t_5           Pearson    .08   .12   .26   .34   .67   .67   .89
              copula     .06   .05   .06   .08   .09   .12   .18
              Kendall    .05   .15   .19   .40   .67   .73   .95
t_3           Pearson    .10   .11   .26   .25   .56   .50   .67
              copula     .06   .08   .06   .08   .09   .09   .17
              Kendall    .05   .12   .18   .34   .62   .67   .90
t_1           Pearson    .46   .49   .52   .50   .56   .50   .54
              copula     .08   .07   .08   .09   .11   .12   .14
              Kendall    .06   .12   .11   .18   .34   .34   .53

(7) For normal, $t_{20}$, $t_5$ and $t_3$ innovations, the performance of the Kendall and the
Pearson test are rather similar. The effect of the heavy tails is less pronounced than in
the independent case. This is not entirely surprising. In Model 6.2,

$$(X_i, Y_i) = \sum_{k=0}^{\infty} \varphi^k (\delta_{i-k}, \varepsilon_{i-k})$$

is a sum of independent random variables. Although the sum is generally not normal
(the summands do not satisfy the Lindeberg condition) and does not possess any higher
moments than $(\delta_i, \varepsilon_i)$ itself, it is, purely heuristically speaking, closer to a normal
distribution than the innovations (if these possess finite second moments). Thus we expect
the performance of the change-point tests to be closer to that at the normal model than
is the case in Model 6.1.
(8) We have noted that positive jumps are generally better detected (starting from
correlation 0.4). Furthermore, we note in both tables, but more clearly in Table 2, that the
difference in power (positive jump vs. negative jump of equal height) is larger for the
Pearson test than for the Kendall test. In Table 2 we even find that the Kendall test is
always better at detecting negative jumps, whereas the Pearson test is better at detecting
the majority of the positive jumps. This is favorable for the Kendall test, since in practical
situations one is much more likely to encounter a change in correlation from, say, 0 to
0.4 than from 0.4 to 0.8. This behavior is to be expected comparing (20) and (21):
at the normal model the asymptotic relative efficiency of Kendall's tau with respect to
Pearson's correlation coefficient increases as the absolute value of the true correlation
decreases and reaches its maximum of $9/\pi^2$ at $\rho = 0$. However, we have no apparent
reason why the effect is much more pronounced in the AR(1) case than in the independent
case.



The apparent advantages of the Pearson test for large non-zero correlations observed in
Tables 1 and 2 must be put into perspective with the following: First, in Table 2 the Pearson
test never keeps the 0.05 significance level under the null hypothesis. Adjusting the critical
value accordingly will result in much lower rejection frequencies under the alternatives.
Second, ellipticity implies that any monotone dependence is basically linear. This intrinsic
linearity of the elliptical model favors measures of linear dependence, but is a questionable
assumption in practice. In particular, strong linear dependence is rarely encountered. Often,
monotone dependence is of interest rather than linear dependence. The prevalent use of linear
correlation coefficients to measure monotone dependence is presumably due to their
simplicity and amplified by their historical dominance. In alternative models that may exhibit
strong monotone but not necessarily linear dependence, the picture is much better for the
Kendall test. We present results for elliptical distributions due to their widespread use.

Table 3. Robustness comparison of correlation change-point tests. Serial independence,
bivariate Gaussian distribution with $\rho = 0.4$. Different sizes and fractions of outliers;
500 observations. Rejection frequencies at the asymptotic .05 level based on 1000
repetitions.

Outlier magnitude:            U(0,5)                  U(10,100)
Outlier fraction (%):      1     2     5    10      1     2     5    10
Test:
Pearson                  .09   .40   .99  1.00   1.00  1.00  1.00  1.00
copula                   .03   .04   .10   .31    .03   .06   .15   .52
Kendall                  .06   .11   .67   .99    .10   .25   .89  1.00


We have noticed that in all simulations the copula test has a very low power and is
outperformed by the Kendall test. Besides the applicability for heavy-tailed data, the second
motivation for introducing both tests, copula and Kendall, is the lack of robustness of the
Pearson test. The question remains how both tests compare with respect to their robustness
properties. We want to obtain a rough idea by simulating an outlier scenario. We use, as
before, $n = 500$ and sample from the null hypothesis of Model 6.1 with Gaussian margins, i.e.,
the $(X_i, Y_i)$ are independent and have constant correlation 0.4. In the second half of the
sample we randomly replace some observations by outliers. The outliers are of the form
$(\zeta_i, -\zeta_i)$, where $\zeta_i$ is drawn from either $U(-5, 5)$ (small outliers) or from
$U((-100, -10) \cup (10, 100))$ (large outliers), suggesting a strong negative correlation. The
position of the outliers is a uniform sample without replacement from $251, \ldots, 500$.
Although size as well as position of the outliers are random, structure and placement are
very unfavorable for the null hypothesis. This outlier scenario can be considered as a worst
case, atypical for contaminated data encountered in practice. The results are given in
Table 3. The Kendall test can cope with a few bad outliers of the described type, but gets
considerably biased as the number of outliers increases. The copula test can cope very well
even with a large fraction of severe outliers, but it is debatable whether this makes up for
the low efficiencies reported in Tables 1 and 2.
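
The outlier scenario can be sketched in a few lines of NumPy. The sketch assumes that the outlier fraction refers to the whole sample of size $n$ (the text does not state the reference set explicitly); the function name and return values are illustrative.

```python
import numpy as np


def contaminate(xy, fraction, large=False, seed=None):
    """Replace a fraction of the observations in the second half by outliers.

    Outliers have the form (zeta, -zeta). Small outliers use zeta ~ U(-5, 5);
    large outliers use |zeta| ~ U(10, 100) with a random sign. The fraction
    is taken relative to the full sample size (an assumption of this sketch).
    Returns the contaminated copy and the zero-based outlier positions.
    """
    rng = np.random.default_rng(seed)
    xy = np.array(xy, dtype=float)            # work on a copy
    n = len(xy)
    n_out = int(round(fraction * n))
    # Zero-based indices n//2, ..., n-1 correspond to observations 251, ..., 500.
    positions = rng.choice(np.arange(n // 2, n), size=n_out, replace=False)
    if large:
        magnitude = rng.uniform(10.0, 100.0, size=n_out)
        zeta = magnitude * rng.choice([-1.0, 1.0], size=n_out)
    else:
        zeta = rng.uniform(-5.0, 5.0, size=n_out)
    xy[positions, 0] = zeta
    xy[positions, 1] = -zeta
    return xy, positions
```

Because every outlier lies exactly on the anti-diagonal, even a few large ones push the second-half sample correlation towards -1, which is why this design is so unfavorable for moment-based tests.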
Objective (C). The example processes considered so far are all absolutely regular with
exponential decay of the mixing coefficients. However, Theorem 2.6 poses the much weaker
condition on the data process to be P-near epoch dependent on an absolutely regular process.
In the remainder of this section we study an example of a process which is neither absolutely
regular itself nor $L_1$-near epoch dependent on any process (in the sense of Definition 2.2 (ii)),
but which can be treated by our general P-NED formulation.

18 H. DEHLING, D. VOGEL, M. WENDLER, AND D. WIED

Table 4. Heavy-tailed, non-mixing process (Model 6.3). 500 observations,
several alternatives. Empirical rejection frequencies at the asymptotic .05
level based on 1000 repetitions.

Jump at n/2:     none   -.2   +.2   -.4   +.4   -.6   -.8
copula test:      .04   .03   .05   .09   .09   .19   .36
Kendall test:     .05   .38   .36   .89   .99  1.00  1.00

Model 6.3. Let, as before, $((X_i, Y_i))_{i \in \mathbb{Z}}$ follow an AR(1) process

$$X_i = \varphi X_{i-1} + \delta_i, \qquad Y_i = \varphi Y_{i-1} + \varepsilon_i,$$

where the $(\delta_i, \varepsilon_i)$, $i \in \mathbb{Z}$, are independent and identically distributed. Now let
$\varphi = 1/2$ and $(\delta_i, \varepsilon_i)$ have the following discrete distribution:

$$P\big((\delta_i, \varepsilon_i) = (0, \tfrac12)\big) = P\big((\delta_i, \varepsilon_i) = (\tfrac12, 0)\big) = (1 - \rho)/4,$$

$$P\big((\delta_i, \varepsilon_i) = (0, 0)\big) = P\big((\delta_i, \varepsilon_i) = (\tfrac12, \tfrac12)\big) = (1 + \rho)/4.$$

The parameter $\rho \in [-1, 1]$ is the moment correlation of this distribution. The Kendall's tau
coefficient (the one symmetric around 0, cf. Remark 2.1) of $(\delta_i, \varepsilon_i)$ is $\tilde\tau = \rho/2$. The
process $((X_i, Y_i))_{i \in \mathbb{Z}}$ is not strongly mixing (see e.g. Ibragimov and Linnik (1971), p. 360),
and hence not absolutely regular, but by Lemma 2.4 it is P-NED on $((\delta_i, \varepsilon_i))_{i \in \mathbb{Z}}$ with
exponentially decreasing approximation coefficients. The pair $(X_i, Y_i)$ has the same moment
correlation $\rho$ as $(\delta_i, \varepsilon_i)$, its Kendall rank correlation is $\tilde\tau = 2\rho/(3 - \rho^2)$, and the
margins $X_i$ and $Y_i$ are uniformly distributed on $(0, 1)$. We simulate data from the process
$((\tilde X_i, \tilde Y_i))_{i \in \mathbb{Z}}$ with

$$\tilde X_i = H(X_i), \qquad \tilde Y_i = H(Y_i),$$

where $H$ denotes the quantile function of a Pareto (type I) distribution with shape parameter
1/2 and location parameter 1, i.e.

$$H(x) = \frac{1}{(1 - x)^2}, \qquad x \in [0, 1).$$

This strictly increasing transformation leaves the P-NED coefficients as well as Kendall's
tau unchanged. The margins $\tilde X_i$, $\tilde Y_i$ are Pareto distributed and have finite moments only
up to order less than 1/2.
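
A simulation of Model 6.3 might look as follows. The sketch assumes the innovation distribution on $\{0, \tfrac12\}^2$ with agreement probability $(1 + \rho)/2$ and the Pareto quantile transform $H(x) = (1 - x)^{-2}$. The function name, the `burn_in` length and the guard against values rounding to exactly 1 are illustrative choices.

```python
import numpy as np


def simulate_model_6_3(n, rho, burn_in=60, seed=None):
    """Simulate n Pareto-transformed observations of the AR(1) model, phi = 1/2.

    Innovations: delta is 0 or 1/2 with equal probability, and epsilon equals
    delta with probability (1 + rho)/2, otherwise it takes the other value.
    This reproduces the four cell probabilities (1 - rho)/4 and (1 + rho)/4.
    """
    rng = np.random.default_rng(seed)
    total = n + burn_in
    delta = 0.5 * rng.integers(0, 2, size=total)
    agree = rng.random(total) < (1.0 + rho) / 2.0
    eps = np.where(agree, delta, 0.5 - delta)

    u = np.empty((total, 2))
    x = y = 0.0
    for i in range(total):
        x = 0.5 * x + delta[i]
        y = 0.5 * y + eps[i]
        u[i] = (x, y)
    u = u[burn_in:]                               # drop the start-up phase
    u = np.minimum(u, np.nextafter(1.0, 0.0))     # guard against rounding to 1.0
    return 1.0 / (1.0 - u) ** 2                   # quantile transform H
```

Since $H$ is strictly increasing, rank-based statistics such as Kendall's tau give the same value on the uniform and on the Pareto-transformed data.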

The simulation set-up (including bandwidth and kernel for the variance estimation) is
exactly the same as in Table 2. The sample size is always 500, the correlation parameter $\rho$
is equal to 0.4 for the first 250 innovations and then jumps by one of the values given in
Table 4. The reported rejection frequencies of the tests are based on 1000 repetitions. The
results are comparable to those in the heavy-tailed, elliptical i.i.d. case, cf. Table 1.

7. Data Examples 



Wied et al. (2012) analyze the dependence between the German stock index (DAX) and the
Standard & Poor's 500 (S&P 500). We apply the Kendall test and the Pearson test (with the
same parameter choices for the variance estimation as in the simulation section) to the daily
log returns of the two financial indices in the years 2006 through 2009 (1043 observations).
The second half of this period covers what has been termed the Global Financial Crisis.
The processes $(\frac{k}{\sqrt{n}} |\hat r_k - \hat r_n|)_{k=1,\ldots,n}$ and $(\frac{k}{\sqrt{n}} |\hat\tau_k - \hat\tau_n|)_{k=1,\ldots,n}$ are depicted in
Figure 3. Their maxima are the values of the test statistics of the Pearson and the Kendall
test, respectively. Both tests give a p-value below 0.005, and both attain their maximum on
July 14, 2008, at the height of the financial crisis. (Lehman Brothers filed for bankruptcy on
September 15, 2008.) The tests behave similarly, and their outcome supports the conclusion
that the dependence between the two indices cannot be assumed to be identical before and
during the financial crisis of 2008.

Figure 3. Processes $\frac{k}{\sqrt{n}} |\hat r_k - \hat r_n|$ (grey) and $\frac{k}{\sqrt{n}} |\hat\tau_k - \hat\tau_n|$ (black), $k = 1, \ldots, n$,
computed from log returns of DAX and S&P 500 between Jan 1, 2006, and Dec 31, 2009.

In a second data example, the difference between the two tests becomes apparent. We consider
the Dow Jones Industrial Average and the Nasdaq Composite in the years 1987 and 1988
(Figure 4). The most notable feature of both time series is the heavy loss on October
19, 1987, commonly known as Black Monday. Here we may ask in particular whether the
market conditions substantially changed after this date. Does Black Monday constitute a
break in the correlation between the two time series? The Pearson test reports a p-value
indistinguishable from zero by machine accuracy. The underlying processes of the Pearson
and the Kendall test are shown in Figure 5. The outcome of the Pearson test is determined
by the peak on October 19, 1987, which is explained as follows. On October 19, both indices
suffered heavy losses, suggesting a strong positive correlation of their log returns. The
following day the Dow Jones recovered to some small degree, whereas the Nasdaq experienced
an even larger drop, suggesting strong negative correlation. Thus the process of successive
sample correlations jumps up and immediately down again.

The Kendall test gives a p-value of 0.24, indicating one can assume the correlation between 
the two time series to be constant over the observed time period. The empirical Kendall's 
tau is 0.52 prior to Black Monday, and 0.56 afterwards. Indeed, the market conditions turned 
out to be not much different from before, the DJIA even closed positive for 1987. 

The strong impact of a few or even a single extreme observation on the Pearson test 
demonstrates once more the inappropriateness of the moment correlation for heavy-tailed 
data. 



8. Conclusion 

We have presented a fluctuation test for detecting changes in the dependence between two 
time series based on Kendall's rank correlation coefficient. We have demonstrated the non- 
inferiority of the test in terms of efficiency and the clear superiority in terms of robustness 
and applicability to a similar, previously proposed test, which is based on Pearson's moment 
correlation. To allow arbitrarily heavy-tailed data and very weak assumptions concerning the 
serial dependence, we have introduced the concept of near epoch dependence in probability. 

We have studied the asymptotic behavior of the test statistic under stationarity by means
of limit theorems for U-statistics for weakly dependent, stationary processes. Simulations
show that the proposed test possesses a variety of advantageous features that have not been
discussed in this paper. It has power against gradual or fluctuating changes in the
correlation, not only sudden jumps, as presented in Section 6. It also exhibits a much better
robustness against heteroscedasticity than the Pearson test.

Figure 4. Daily closings of Dow Jones Industrial Average and Nasdaq Composite from
Jan 1, 1987, to Dec 31, 1988.

Figure 5. Processes underlying the Pearson and the Kendall test, computed from log returns
of DJIA and Nasdaq Composite between Jan 1, 1987, and Dec 31, 1988.

However, a thorough theoretical assessment of these properties, as well as the construction
of tests that explicitly allow for heteroscedasticity, requires the study of U-statistics of
non-stationary sequences. This is mathematically rather challenging and, to our knowledge,
not treated in the literature. It is certainly a topic of future research and goes beyond the
scope of this paper.

A yet unsatisfactory property of the test is the lack of finite sample accuracy in the case 
of strong serial dependence. This can be overcome by bootstrapping the test statistic using 
a block bootstrap, but the theoretical justification for such a procedure, again, holds some 
considerable mathematical challenge. 

Further future research directions include, e.g., the extension to more than two dimensions
or guidelines for an on-line application of the test with results about the detection time of a
change. Another interesting research question, which is related to the one studied here, is to
devise a robust test for detecting changes in the coherence of two time series. For example,
a series of i.i.d. variables, shifted by one observation, is highly coherent to the original
series, but our test, which only compares observations at the same time point, does not detect
that type of dependence.
Appendix A. Proofs of Section 2
Proof of Lemma \2.4\ Taking fk{Z-k, ■ ■ ■ , Zk) = Yli=o of'^-ii ^6 have to show that 



P(|Xo-/.(Z.,,...,Z,)|,>£) = i^(|Ei!o"'^'li>^) 



< fCe^^lar^ 



Let |a| < 6 < 1. If, for all / G IN, 



'Zi\^<{l-h)h' 



ifc+i- 



then 



Hence 



A^i=o ' ^ - A^\ Ml - 



|fc+i' 



P 



Z^ 



1=0 



a'Zi 



1=0 



> 



\k+l 



< 5^P( |a'Z;| > 

1=0 ^ 

-5(1-6) 



1 - b)b^e 



oo 



|a'Z;| > 



'\-h)\}e 



ife+i 



(1 - h)\}e 



Ifc+l+Z 



C 



\a 



fc+i 



1=0 



al 



< Ke-'^lar'' 



for some K > 0. 



D 



Proof of Lemma 2.5. Part (i) is straightforward.

Part (ii): There are positive constants $C_1, C_2$ such that

$$E\big|X_0 - E(X_0 | \mathcal{F}_{-k}^{k})\big|_1^p \le C_1 E\big|X_0 - E(X_0 | \mathcal{F}_{-k}^{k})\big|_1 \tag{24}$$

$$\le C_2 E\big|X_0 - f_k(Z_{-k}, \ldots, Z_k)\big|_1. \tag{25}$$

The first inequality (24) is due to the boundedness of $X_0$, and the constant $C_1$ depends
on its bound and $p$. The second inequality (25) does generally not hold with $C_1 = C_2$.
The conditional expectation provides the best $L_2$ approximation, but here we consider the
$L_1$ distance. We may, however, argue as follows: By applying Jensen's inequality for the
conditional expectation to the convex function $|\cdot|_1$ we obtain

$$\big|E(X_0 | \mathcal{F}_{-k}^{k}) - f_k(Z_{-k}, \ldots, Z_k)\big|_1 \le E\big(|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1 \,\big|\, \mathcal{F}_{-k}^{k}\big),$$

and by taking the expectation of both sides we get

$$E\big|X_0 - E(X_0 | \mathcal{F}_{-k}^{k})\big|_1 \le E|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1 + E\big|f_k(Z_{-k}, \ldots, Z_k) - E(X_0 | \mathcal{F}_{-k}^{k})\big|_1$$

$$\le 2 E|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1.$$

Now for any $\epsilon > 0$ we have

$$E|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1 = E\big(|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1 \mathbf{1}_{\{|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1 > \epsilon\}}\big)$$

$$+ E\big(|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1 \mathbf{1}_{\{|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1 \le \epsilon\}}\big)$$

$$\le C_3 P\big(|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1 > \epsilon\big) + \epsilon \le C_3 \Phi(\epsilon) a_k + \epsilon.$$

Combining this with (25) we arrive at

$$E\big|X_0 - E(X_0 | \mathcal{F}_{-k}^{k})\big|_1^p \le C_2 C_3 \Phi(\epsilon) a_k + C_2 \epsilon.$$

By first choosing $\epsilon$ sufficiently small and then $k$ sufficiently large we can make the
left-hand side arbitrarily small, and hence $(X_n)_{n \in \mathbb{Z}}$ is $L_p$-NED, $p \ge 1$, on $(Z_n)_{n \in \mathbb{Z}}$.
In particular, if condition (3) holds, we get by taking $\epsilon = s_k$:

$$a_{p,k} \le C_2 C_3 \Phi(s_k) a_k + C_2 s_k = O(s_k).$$

Part (iii): Letting $f_k(Z_{-k}, \ldots, Z_k) = E(X_0 | \mathcal{F}_{-k}^{k})$ we have for every $\epsilon > 0$:

$$P\big(|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1 > \epsilon\big) \le P\big(|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1^p > \epsilon^p\big)$$

$$\le \frac{1}{\epsilon^p} E|X_0 - f_k(Z_{-k}, \ldots, Z_k)|_1^p \le \frac{a_{p,k}}{\epsilon^p}.$$

By choosing $\Phi(\epsilon) = \epsilon^{-p}$ and $a_k = a_{p,k}$ we have $a_k \to 0$ as $k \to \infty$, and $(X_n)_{n \in \mathbb{Z}}$ is hence
P-NED on $(Z_n)_{n \in \mathbb{Z}}$. □



Towards the proof of Theorem 2.6 we state the following lemma.

Lemma A.1. Let $((X_i, Y_i))_{i \ge 1}$ be a stationary process with Lipschitz continuous marginal
distribution function $F$. The kernel $h : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ of Kendall's tau satisfies the
variation condition.

Proof. Let $(X, Y)$ and $(X', Y')$ be independent copies of $(X_1, Y_1)$. Since $h$ is an indicator
function we have

$$E\Big[\sup_{|(x,y,x',y') - (X,Y,X',Y')|_2 \le \epsilon} \big|h((x,y),(x',y')) - h((X,Y),(X',Y'))\big|\Big] = P(\Omega_0),$$

where $\Omega_0$ is the set of all $\omega \in \Omega$ for which there is a point $(x, y, x', y')$ in the $\epsilon$-ball
around $(X(\omega), Y(\omega), X'(\omega), Y'(\omega)) \in \mathbb{R}^4$ such that exactly one of
$(X(\omega) - X'(\omega))(Y(\omega) - Y'(\omega))$ and $(x - x')(y - y')$ is positive. Then $\omega \in \Omega_0$ implies
$(X(\omega) - X'(\omega), Y(\omega) - Y'(\omega)) \in A_0$ with

$$A_0 = \big\{(s, t) \in \mathbb{R}^2 : \min(|s|, |t|) \le \sqrt{2}\epsilon\big\}.$$

Letting $\epsilon' = \sqrt{2}\epsilon$,

$$P(\Omega_0) \le P\big((X - X', Y - Y') \in A_0\big) \le P(|X - X'| \le \epsilon') + P(|Y - Y'| \le \epsilon')$$

$$= \int_{\mathbb{R}} \big(F_X(t + \epsilon') - F_X(t - \epsilon')\big)\, dF_X(t) + \int_{\mathbb{R}} \big(F_Y(t + \epsilon') - F_Y(t - \epsilon')\big)\, dF_Y(t) \le 4 L_0 \epsilon',$$

where $L_0$ is the Lipschitz constant of $F$. Hence the variation condition is fulfilled with
$L = 4\sqrt{2} L_0$. □



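Proof of Theorem 2.6 (Asymptotic distribution of $T_n$). We use the U-statistic representation
of Kendall's tau,

$$\hat\tau_n = \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} h\big((X_i, Y_i), (X_j, Y_j)\big),$$

where $h : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ is the kernel function of Kendall's tau. The kernel $h$ is bounded
and satisfies the variation condition (Lemma A.1). By first applying a bounded, strictly
increasing transformation to both margin processes $(X_i)_{i \ge 1}$ and $(Y_i)_{i \ge 1}$, which leaves
Kendall's tau and hence $(\sqrt{n}\lambda(\hat\tau_{[n\lambda]} - \hat\tau_n))_{0 \le \lambda \le 1}$ unchanged, we conclude from
Corollary 3.4 that

$$\big(\sqrt{n}\lambda(\hat\tau_{[n\lambda]} - \hat\tau_n)\big)_{0 \le \lambda \le 1} \xrightarrow{\mathcal{D}} 2\sigma_h \big(B(\lambda)\big)_{0 \le \lambda \le 1},$$

where

$$\sigma_h^2 = \operatorname{Var}(h_1(X_1, Y_1)) + 2 \sum_{i=2}^{\infty} \operatorname{Cov}\big(h_1(X_1, Y_1), h_1(X_i, Y_i)\big). \tag{26}$$

Here $h_1$ denotes the linear part of the kernel $h$ in the Hoeffding decomposition, cf. (11). To
convince ourselves that $\sigma_h^2$ coincides with $D^2$, we need to check that $h_1 = \psi$.
First note that $\tau = E(h((X, Y), (X', Y')))$, where $(X, Y)$ and $(X', Y')$ are two independent
copies drawn from $F$. The first order term of the Hoeffding decomposition is then given by

$$h_1(x, y) = E\big(h((x, y), (X, Y))\big) - \tau = P\big((X - x)(Y - y) > 0\big) - \tau$$

$$= P(X < x, Y < y) + P(X > x, Y > y) - \tau.$$

Since the one-dimensional marginals $F_X(x) = P(X \le x)$ and $F_Y(y) = P(Y \le y)$ are
continuous, we may express $h_1(x, y)$ as follows:

$$h_1(x, y) = P(X \le x, Y \le y) + \big(1 - P(X \le x) - P(Y \le y) + P(X \le x, Y \le y)\big) - \tau$$

$$= 2F(x, y) - F_X(x) - F_Y(y) + 1 - \tau = \psi(x, y).$$

Finally, by the continuous mapping theorem, applied to $f \mapsto \sup |f|$, we have

$$\sqrt{n} \sup_{0 \le \lambda \le 1} \lambda |\hat\tau_{[n\lambda]} - \hat\tau_n| \xrightarrow{\mathcal{D}} 2D \sup_{0 \le \lambda \le 1} |B(\lambda)|,$$

and by noting that
$\sup_{0 \le \lambda \le 1} \lambda |\hat\tau_{[n\lambda]} - \hat\tau_n| - \max_{0 \le k \le n} \frac{k}{n} |\hat\tau_k - \hat\tau_n| = o_p(1/\sqrt{n})$,
we have proved the theorem. □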
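
As a quick consistency check of the identity $h_1 = \psi$, one can work out the variance of $\psi$ when the two margins are independent and continuous. The abbreviations $U = F_X(X)$ and $V = F_Y(Y)$ are introduced here for illustration only. The result agrees with the value $D^2 = 1/36$ stated for Model 6.1 in Section 6.

```latex
% Assumption: X and Y are independent with continuous margins, so that
% U = F_X(X) and V = F_Y(Y) are independent and uniform on (0,1),
% F(x,y) = F_X(x) F_Y(y), and \tau = P((X - X')(Y - Y') > 0) = 1/2.
\begin{align*}
\psi(X,Y) &= 2UV - U - V + 1 - \tfrac12
           = 2\bigl(U - \tfrac12\bigr)\bigl(V - \tfrac12\bigr), \\
\operatorname{Var}\bigl(\psi(X,Y)\bigr)
          &= 4\, E\bigl(U - \tfrac12\bigr)^2 \, E\bigl(V - \tfrac12\bigr)^2
           = 4 \cdot \tfrac{1}{12} \cdot \tfrac{1}{12} = \tfrac{1}{36}.
\end{align*}
```

For serially independent observations all covariance terms in (26) vanish, so this variance is the whole of $D^2$.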



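Proof of Theorem 2.7 (Consistency of the variance estimator). With $\psi$ being the linear part
of the kernel $h$, cf. (11), we have $E\psi(X_i, Y_i) = 0$. Letting

$$X_{n,t} = n^{-1/2} \psi(X_t, Y_t),$$

we know from (30) with $\psi$ in the role of $g_1$ that

$$\lim_{n \to \infty} \sum_{t=1}^{n} \sum_{s=1}^{n} E(X_{n,t} X_{n,s}) = \operatorname{Var}(\psi(X_1, Y_1)) + \lim_{n \to \infty} 2 \sum_{i=2}^{n} \frac{n - i + 1}{n} \operatorname{Cov}\big(\psi(X_1, Y_1), \psi(X_i, Y_i)\big) = D^2.$$

Applying Theorem 2.1 of de Jong and Davidson (2000) to the array $(X_{n,t})_{n \in \mathbb{N},\, t = 1, \ldots, n}$ yields

$$\sum_{t=1}^{n} \sum_{s=1}^{n} \kappa\Big(\frac{t - s}{b_n}\Big) X_{n,t} X_{n,s} = \frac{1}{n} \sum_{i=1}^{n} \psi(X_i, Y_i)^2 + \frac{2}{n} \sum_{j=1}^{n-1} \kappa\Big(\frac{j}{b_n}\Big) \sum_{i=1}^{n-j} \psi(X_i, Y_i) \psi(X_{i+j}, Y_{i+j}) \xrightarrow{p} D^2.$$

Parts (i) and (ii) of Lemma 2.5 together ensure that the assumptions of the theorem are
met. It remains to show that the convergence also holds if $\psi(X_i, Y_i)$ is replaced by
$\hat\psi_{n,i}$, i.e.,

$$\frac{1}{n} \sum_{i=1}^{n} \hat\psi_{n,i}^2 + \frac{2}{n} \sum_{j=1}^{n-1} \kappa\Big(\frac{j}{b_n}\Big) \sum_{i=1}^{n-j} \hat\psi_{n,i} \hat\psi_{n,i+j} \xrightarrow{p} D^2. \tag{27}$$

In order to prove (27) we abbreviate $\psi(X_i, Y_i)$ by $\psi_i$ and show

$$\text{(I)} \quad \frac{1}{n} \sum_{i=1}^{n} \big(\hat\psi_{n,i}^2 - \psi_i^2\big) \xrightarrow{p} 0, \qquad \text{(II)} \quad \frac{2}{n} \sum_{j=1}^{n-1} \kappa\Big(\frac{j}{b_n}\Big) \sum_{i=1}^{n-j} \big(\hat\psi_{n,i} \hat\psi_{n,i+j} - \psi_i \psi_{i+j}\big) \xrightarrow{p} 0.$$

The main tool for proving both statements (I) and (II) is Theorem 3.5. Recall

$$\psi_i = 2F(X_i, Y_i) - F_X(X_i) - F_Y(Y_i) + 1 - \tau,$$

$$\hat\psi_{n,i} = 2F_n(X_i, Y_i) - F_{X,n}(X_i) - F_{Y,n}(Y_i) + 1 - \hat\tau_n.$$

Theorem 3.5 states that the empirical process

$$\big(\sqrt{n}(F_n(x, y) - F(x, y))\big)_{(x,y) \in \mathbb{R}^2}$$

converges weakly to a two-dimensional Gaussian limit process in $D(\mathbb{R}^2)$. By the continuous
mapping theorem, applied to $f \mapsto \sup |f|$, we have that

$$S_n = \sup_{(x,y) \in \mathbb{R}^2} \sqrt{n} |F_n(x, y) - F(x, y)| \tag{28}$$

also converges in distribution and is hence stochastically bounded. As a consequence of
Theorem 3.2, it also holds that $\sqrt{n}(\hat\tau_n - \tau) = O_p(1)$ for $n \to \infty$. Furthermore we note that

$$\sqrt{n} |F_{X,n}(x) - F_X(x)| = \lim_{y \to \infty} \sqrt{n} |F_n(x, y) - F(x, y)| \le S_n, \tag{29}$$

likewise for $F_{Y,n}$ and $F_Y$, and $|\hat\psi_{n,i} + \psi_i| \le 6$ for all $i$ and $n$. Analyzing the
expression in (I) we find

$$\frac{1}{\sqrt{n}} \Big|\sum_{i=1}^{n} \big(\hat\psi_{n,i}^2 - \psi_i^2\big)\Big| \le \frac{1}{\sqrt{n}} \sum_{i=1}^{n} |\hat\psi_{n,i} - \psi_i| \, |\hat\psi_{n,i} + \psi_i| \le \frac{6}{\sqrt{n}} \sum_{i=1}^{n} |\hat\psi_{n,i} - \psi_i|$$

$$\le \frac{6}{n} \sum_{i=1}^{n} \Big(2\sqrt{n} |F_n(X_i, Y_i) - F(X_i, Y_i)| + \sqrt{n} |F_{X,n}(X_i) - F_X(X_i)| + \sqrt{n} |F_{Y,n}(Y_i) - F_Y(Y_i)| + \sqrt{n} |\hat\tau_n - \tau|\Big)$$

$$\le \frac{6}{n} \sum_{i=1}^{n} \big(4 S_n + \sqrt{n} |\hat\tau_n - \tau|\big) = 6 \big(4 S_n + \sqrt{n} |\hat\tau_n - \tau|\big) = O_p(1).$$

Hence

$$\frac{1}{n} \sum_{i=1}^{n} \big(\hat\psi_{n,i}^2 - \psi_i^2\big) \xrightarrow{p} 0,$$

and (I) is proved. For (II) we first note that by a very similar argumentation as above

$$\frac{1}{\sqrt{n}} \Big|\sum_{i=1}^{n-j} \big(\hat\psi_{n,i} \hat\psi_{n,i+j} - \psi_i \psi_{i+j}\big)\Big| \le 6 \big(4 S_n + \sqrt{n} |\hat\tau_n - \tau|\big) = O_p(1).$$

The second step is then

$$\Big|\frac{2}{n} \sum_{j=1}^{n-1} \kappa\Big(\frac{j}{b_n}\Big) \sum_{i=1}^{n-j} \big(\hat\psi_{n,i} \hat\psi_{n,i+j} - \psi_i \psi_{i+j}\big)\Big| \le \frac{2 b_n}{\sqrt{n}} \cdot \frac{1}{b_n} \sum_{j=1}^{n-1} \Big|\kappa\Big(\frac{j}{b_n}\Big)\Big| \cdot \big(24 S_n + 6\sqrt{n} |\hat\tau_n - \tau|\big).$$

The first factor $2 b_n/\sqrt{n}$ converges to zero, the second factor converges to
$\int_0^\infty |\kappa(x)|\,dx$ (which is ensured by the existence of an integrable, monotonic dominating
function of $\kappa$) and is hence bounded. The third factor is stochastically bounded by the
above considerations, and hence the product converges to zero in probability. Thus we have
proved (II) and consequently (27). Finally, by Slutsky's lemma

$$\frac{T_n}{2 \hat D_n} \xrightarrow{\mathcal{D}} \sup_{0 \le \lambda \le 1} |B(\lambda)|.$$

The proof is complete. □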

0<A<1 
Appendix B. Proofs of Section 3

Proof of Theorem 3.2 (Invariance principle for U-statistics). From the Hoeffding
decomposition we obtain

$$U_{[n\lambda]} - \theta = \frac{2}{[n\lambda]} \sum_{i=1}^{[n\lambda]} g_1(X_i) + \binom{[n\lambda]}{2}^{-1} \sum_{1 \le i < j \le [n\lambda]} g_2(X_i, X_j),$$

and hence

$$\sqrt{n}\lambda (U_{[n\lambda]} - \theta) = \frac{2\sqrt{n}\lambda}{[n\lambda]} \sum_{i=1}^{[n\lambda]} g_1(X_i) + \sqrt{n}\lambda \binom{[n\lambda]}{2}^{-1} \sum_{1 \le i < j \le [n\lambda]} g_2(X_i, X_j).$$

Thus the proof splits into two parts. We will show that of the two terms on the right-hand
side, the first one is the dominating term that converges to a Brownian motion, while the
second term converges to zero uniformly in $\lambda$.

Part (1): By parts ^ and ([n]) of Lemma 2.5, (5'i(Xj))ig^ is L2-NED on (Zi)ifz'z with 
approximating constants a2,k = 0{k~^^~^'^^^'^). The process {Zi)i^'z is assumed to be abso- 
lutely regular with coefficients f3k = 0{k~^^^^^), hence it is also a-mixing with coefficients 



that decline to zero with at least the same rate. Applying Corollary 3.2 of Wooldridge and 



White] ( |1988[ ) (with gi{Xi) and Zj in the role of Zi and Fj, respectively) yields 

(W^(A))o<A<i, 



\ '=^ /0<A<1 



V 
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where ly is a standard Brownian motion and 

n 
/n •'— ^ 



n-1 



n — k , 
n 



(30) ai = Y^T\^}_^gi{Xk)\ = Var((7i(Xi)) + 2 J^ — — Cov((7i(Xi), (7i(Xfe+i)). 

k=l J k=l 

Under the conditions of Theorem 3.2 the limit variance

$$\sigma^2 = \mathrm{Var}(g_1(X_1)) + 2 \sum_{i=2}^{\infty} \mathrm{Cov}(g_1(X_1), g_1(X_i))$$

is finite, cf. Theorem 2.3 of Ibragimov (1962), and $\sigma_n$ converges to $\sigma$ as $n \to \infty$. Thus, by
Slutsky's lemma we also have

$$\left( \frac{1}{\sqrt{n}} \sum_{i=1}^{[n\lambda]} g_1(X_i) \right)_{0 \le \lambda \le 1} \xrightarrow{\;\mathcal{D}\;} (\sigma W(\lambda))_{0 \le \lambda \le 1}.$$
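The convergence of the finite-sample variance to the limit variance can be illustrated with a toy autocovariance sequence. The sketch below is not taken from the paper: it assumes, purely for illustration, that the summands behave like a stationary AR(1) process with a hypothetical coefficient `phi` and unit innovation variance, and evaluates the weighted sum of autocovariances (weights (n - k)/n) against its limit.

```python
import numpy as np

def sigma_n_sq(gamma, n):
    """gamma(0) + 2 * sum_{k=1}^{n-1} (n - k)/n * gamma(k)."""
    k = np.arange(1, n)
    return gamma(0) + 2.0 * np.sum((n - k) / n * gamma(k))

phi = 0.5  # hypothetical AR(1) coefficient, chosen for illustration only

def gamma_ar1(k):
    """Autocovariance at lag k of a stationary AR(1) process (unit innovation variance)."""
    return phi ** np.asarray(k, dtype=float) / (1.0 - phi ** 2)

# Limit variance: gamma(0) + 2 * sum_{k>=1} gamma(k) = 1 / (1 - phi)^2 for AR(1).
sigma_sq = 1.0 / (1.0 - phi) ** 2
values = [sigma_n_sq(gamma_ar1, n) for n in (10, 100, 1000, 10000)]
print(values, sigma_sq)
```

With positive, summable autocovariances the finite-sample values increase towards the limit, and the gap shrinks at rate 1/n.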



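Part (2): Since

$$\sup_{0 \le \lambda \le 1} \Big| \sqrt{n}\,\lambda \binom{[n\lambda]}{2}^{-1} \sum_{1 \le i < j \le [n\lambda]} g_2(X_i, X_j) \Big| \le \sup_{1 \le k \le n} \frac{k+1}{\sqrt{n}}\, |U_k(g_2)|,$$

we will show that the right-hand side converges to zero almost surely. By Lemma 2.5 (In])
and our assumption that $(X_i)_{i \in \mathbb{N}}$ is bounded, we have that $(X_i)_{i \in \mathbb{N}}$ is $L_1$-NED on $(Z_i)_{i \in \mathbb{Z}}$
with approximating constants $a_{1,k}$ = 0{k~^^~^^^), and further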



(31) EM^'^+E 



fc=i 



^i=k 



Ol, 



0{n 



l-5\ 



for any $0 < \delta < \min(1, \epsilon/2)$. Dehling and Wendler (2010, Theorem 1) show that under
condition (31)

$$\frac{k^{1/2 + \delta/2}}{\log^{3/2} k \, \log\log k}\, U_k(g_2) \xrightarrow{\text{a.s.}} 0.$$

Here we make use of the fact that along with the kernel $g$ also its degenerate part $g_2$ fulfills
the variation condition. Dehling and Wendler (2010) consider $U$-statistics of one-dimensional
processes $(X_k)_{k \ge 1}$, but the result holds true for $U$-statistics of multivariate processes as well.
Hence

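$$\sqrt{k}\, U_k(g_2) \xrightarrow{\text{a.s.}} 0 \qquad (k \to \infty)$$

and further

$$\frac{1}{\sqrt{n}} \sup_{1 \le k \le n} (k+1)\, |U_k(g_2)| \xrightarrow{\text{a.s.}} 0$$

as $n \to \infty$. The proof is complete.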
Proof of Corollary 3.4. We consider the functional on $D[0,1]$ given by

$$x(t) \mapsto x(t) - t\,x(1), \qquad 0 \le t \le 1.$$

This is a continuous functional, which, applied to the process on the left-hand side of (12),
yields

$$\sqrt{n}\,\lambda\,(U_{[n\lambda]} - \theta) - \lambda \sqrt{n}\,(U_n - \theta) = \sqrt{n}\,\lambda\,(U_{[n\lambda]} - U_n).$$

Thus we may apply the continuous mapping theorem to obtain (14). □
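The effect of the functional $x(t) \mapsto x(t) - t\,x(1)$ on a sampled path can be sketched in a few lines. The example below is an illustration under simplifying assumptions: it uses i.i.d. standard normal increments as a stand-in for the centred summands, and the helper name `bridge_map` is hypothetical.

```python
import numpy as np

def bridge_map(x, t):
    """Apply x(t) -> x(t) - t * x(1) to a path sampled on the grid t, where t[-1] == 1."""
    return x - t * x[-1]

rng = np.random.default_rng(1)
n = 1000
steps = rng.standard_normal(n)           # stand-in for centred increments
t = np.arange(1, n + 1) / n              # grid points k/n, k = 1, ..., n
path = np.cumsum(steps) / np.sqrt(n)     # partial-sum process evaluated at t = k/n
bridge = bridge_map(path, t)             # tied down to zero at t = 1

print(bridge[-1], np.max(np.abs(bridge)))
```

The mapped path vanishes at t = 1 and coincides with the usual CUSUM-type process, which is why the limit of the mapped process is a Brownian bridge rather than a Brownian motion.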
Proof of Theorem 3.5 (Empirical process invariance principle). Without loss of generality,
we can assume that $X_n$ and $Y_n$ have a uniform distribution on the interval $[0,1]$; otherwise
we consider the random variables $F_X(X_n)$ and $F_Y(Y_n)$. By Lemma 2.5 (m), the P-NED
condition still holds after this transformation. By Lemma 2.5 (pi) and Proposition 2.11 of
Borovkova, Burton, and Dehling (2001), we have that for any $s, t$, the sequence
$(1_{\{X_n \le s,\, Y_n \le t\}})_{n \in \mathbb{N}}$ is $L_1$-NED with approximation constants $a_{1,k}$ = 0{k~^~^') for some $\epsilon' > 0$.
The finite-dimensional convergence of the process $(W_n(s,t))_{s,t \in [0,1]}$
follows from Theorem 4 of Borovkova et al. (2001) together with the Cramér-Wold device.
For the tightness of the process, we have to show that for every $\epsilon, \delta > 0$, there is a $k \in \mathbb{N}$
such that



(32) limsupPl max sup \Wn{s,t) — Wn 



ki,k2 = l,...,k k-,-1 , , rei 



ki ^2 

■¥'¥ 



>6] <6. 



By the Lipschitz continuity of $F$, we have that $|F(s,t) - F(s',t')| \le |s - s'| + |t - t'|$, so by
the monotonicity of $F$ and $F_n$, we can conclude that for $s \le s'' \le s'$, $t \le t'' \le t'$

$$|W_n(s'',t'') - W_n(s,t)| \le |W_n(s',t') - W_n(s,t)| + \sqrt{n}\,(|s - s'| + |t - t'|).$$
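This bracketing bound is deterministic once $F$ is Lipschitz and both $F$ and $F_n$ are monotone, so it can be verified directly on simulated data. The sketch below is an illustration under an extra assumption not needed in the proof: the two coordinates are independent and uniform, so that $F(s,t) = st$ is available in closed form. It takes $W_n = \sqrt{n}\,(F_n - F)$, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x, y = rng.uniform(size=n), rng.uniform(size=n)

def w_n(s, t):
    """W_n(s, t) = sqrt(n) * (F_n(s, t) - F(s, t)), with F(s, t) = s * t assumed."""
    f_n = np.mean((x <= s) & (y <= t))
    return np.sqrt(n) * (f_n - s * t)

def bound_holds(s, s2, s1, t, t2, t1):
    """Check the bound for s <= s2 <= s1 and t <= t2 <= t1."""
    lhs = abs(w_n(s2, t2) - w_n(s, t))
    rhs = abs(w_n(s1, t1) - w_n(s, t)) + np.sqrt(n) * (abs(s - s1) + abs(t - t1))
    return lhs <= rhs + 1e-12             # small tolerance for rounding

checks = []
for _ in range(1000):
    s, s2, s1 = np.sort(rng.uniform(size=3))   # s <= s'' <= s'
    t, t2, t1 = np.sort(rng.uniform(size=3))   # t <= t'' <= t'
    checks.append(bound_holds(s, s2, s1, t, t2, t1))
print(all(checks))
```

The bound is what allows the supremum over a grid cell to be controlled by the values of $W_n$ at the corners of that cell.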



Let $k = 2^l$ for an $l \in \mathbb{N}$ to be chosen later. Let $L \in \mathbb{N}$ be such that $L > l$ and

Instead of (32), it suffices to show that
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lim sup P j 

n— >oo 

For any two intervals $I_1, I_2 \subset \mathbb{R}$ define

$$Q_n(I_1 \times I_2) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Big( 1_{\{(X_i, Y_i) \in I_1 \times I_2\}} - P\big((X_i, Y_i) \in I_1 \times I_2\big) \Big).$$

Similar to Lemma A.1 one can prove that the indicator of $I_1 \times I_2$ satisfies the variation
condition (Definition 3.1). So we can apply Lemma 3.1 of Borovkova et al. (2001) to obtain



$$E\,Q_n^4(I_1 \times I_2) \le C \Big( P^{1+\delta'}\big((X_1, Y_1) \in I_1 \times I_2\big) + n^{-(1+\delta')} \Big)$$

for some $\delta' > 0$. We now use a chaining technique (see for example van der Vaart and
Wellner (1996), proof of Theorem 2.2.4, for more details) and get
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We will treat the two summands (33) and (34) of the last line separately. By the Lipschitz 
continuity of F, we have that 
With $\delta'' = \delta'/4$, we find that the first summand (33) is bounded from above by

As the last series is summable, we can make it arbitrarily small by choosing $l$, and thus
$k = 2^l$, large enough. Since $2^L < 8\delta^{-1}\sqrt{n}$ and therefore $L < C \log n$, we have that the second
summand (34) is bounded by

which converges to zero as $n \to \infty$.
Appendix C. Proofs of Section 4
Proof of Theorem 4.1 (Change-point identification). The proof relies on the fact that
in $D[0,1]$, where the deterministic function $c$ is defined in (15). With the argmax theorem
(van der Vaart and Wellner, 1996, Corollary 3.2.3) we have that under Assumption 4.2

It remains to prove (35). To simplify notation, we let $m = [bn]$, write $Z_{i,n}$ short for $(X_{i,n}, Y_{i,n})$
and further suppress the subscript $n$. Assume for an instant that the $Z_i$, $i = 1, \ldots, n$, are
independent. Then we have
of the process of $(C_n(\lambda))_{0 \le \lambda \le 1}$ and observe that it converges to $c$. Thus it remains to show
that

also under the short-range dependence assumption of Model 4.1. Although the $Z_i$, $i = 1, \ldots, n$,
are weakly dependent in the following, $c_n$ and $t_n(k)$ are still defined as above,
assuming independent observations. In what follows, we will prove that
The convergence of $\max_{1 \le k \le m}$ ^ |f"fc — C'(^)| follows along the same lines. Hence the maximum
in (37) can be extended to the range $1 \le k \le n$ and (36) follows.

We split the difference - \% — t„(A;)| into three parts: two one-sample $U$-statistics and one
two-sample $U$-statistic. By the triangle inequality we get:
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For the first summand (38), we have by Theorem 3.2 
Due to our assumptions, $(Z_{i,n})_{m+1 \le i \le n}$ is a stationary process which satisfies condition ^
so we also treat the second summand (39) by Theorem 3.2
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For the third summand (40), we apply a two-sample Hoeffding decomposition 
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with 
To obtain a maximal inequality, we use Theorem 2.4.1 of Stout (1974): For random variables
$R_1, \ldots, R_n$ with $E\big( \sum_{i=k}^{k+l} R_i \big)^2 \le C_1 l$ for a constant $C_1$, we have that
We define the random variables $R_j = \sum_{i=1}^{m} h_3(Z_i, Z_{j+m})$. Without loss of generality, we
can assume that the random variables $Z_i$ are bounded and thus the process is $L_1$-NED by
Lemma 2.5. Furthermore, the kernel $h_3$ is degenerate, so we can apply Proposition 6.2 of
Dehling and Fried (2012) to obtain the moment bound
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Applying (42) it follows that 
as $n \to \infty$. Thus the third summand in (41) converges to zero in probability. As for the first
two summands, we have that $E\big( \sum_{j=1}^{l} h_2(Z_j) \big)^2 \le C l$, since

converges to a finite limit as $l \to \infty$. Hence, (42) applied to $R_j = h_2(Z_{j+m})$ leads to

k 
as $n \to \infty$. Finally, $\frac{1}{m} \sum_{i=1}^{m} h_1(Z_i) \to 0$, as its variance converges to zero. We have thus
shown (37), which completes the proof. □

Appendix D. Misc 
Lemma D.1. Let $(X_1, Y_1, X_2, Y_2)$ be jointly Gaussian with covariance matrix

$$\begin{pmatrix} 1 & 0 & \varrho & 0 \\ 0 & 1 & 0 & \varrho \\ \varrho & 0 & 1 & 0 \\ 0 & \varrho & 0 & 1 \end{pmatrix},$$

i.e., $(X_1, X_2)$ and $(Y_1, Y_2)$ are independent, and each has a bivariate normal distribution with
correlation $\varrho$. Then

$$\mathrm{Cov}(\psi(X_1, Y_1), \psi(X_2, Y_2)) = \frac{1}{\pi^2} \arcsin^2\!\Big(\frac{\varrho}{2}\Big),$$

where $\psi$ is as in Theorem 2.6.



Proof. Letting $U_i = F_X(X_i)$, $i = 1, 2$ (i.e., $(U_1, U_2)$ have the Gauss copula with correlation
$\varrho$ as joint distribution function), straightforward calculus yields

$$\mathrm{Cov}(\psi(X_1, Y_1), \psi(X_2, Y_2)) = 2E(U_1 U_2)\,\big(2E(U_1 U_2) - 1\big) + \frac{1}{4}. \tag{43}$$



Croux and Dehon (2010) give the following expression for the population version $s$ of Spearman's
rho at a bivariate normal distribution with correlation $\varrho$:

$$s = \frac{6}{\pi} \arcsin\!\Big(\frac{\varrho}{2}\Big),$$

a result which can be traced back to Pearson (1907). Together with (22) we deduce

$$E(U_1 U_2) = \frac{1}{2\pi} \arcsin\!\Big(\frac{\varrho}{2}\Big) + \frac{1}{4}. \tag{44}$$



Plugging (44) into (43) yields the stated result. 
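The last step can be checked numerically. The sketch below reads (43) as $2e(2e - 1) + 1/4$ with $e = E(U_1 U_2)$ and (44) as $e = \arcsin(\varrho/2)/(2\pi) + 1/4$; under that reading the covariance simplifies algebraically to $(\arcsin(\varrho/2)/\pi)^2$. The value of `rho`, the sample size and the function names are hypothetical choices for illustration, and the Monte Carlo part only checks (44).

```python
import math
import numpy as np

def e_u1u2(rho):
    """Closed form read from (44): arcsin(rho / 2) / (2 * pi) + 1/4."""
    return math.asin(rho / 2.0) / (2.0 * math.pi) + 0.25

def cov_psi(rho):
    """Right-hand side of (43) with E(U_1 U_2) taken from (44)."""
    e = e_u1u2(rho)
    return 2.0 * e * (2.0 * e - 1.0) + 0.25

rho = 0.6  # hypothetical correlation, for illustration only
rng = np.random.default_rng(3)
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=200_000)

# Standard normal distribution function, applied elementwise.
phi_cdf = np.vectorize(lambda v: 0.5 * (1.0 + math.erf(v / math.sqrt(2.0))))
u = phi_cdf(z)                       # sample from the Gauss copula
mc = np.mean(u[:, 0] * u[:, 1])      # Monte Carlo estimate of E(U_1 U_2)

print(mc, e_u1u2(rho))
print(cov_psi(rho), (math.asin(rho / 2.0) / math.pi) ** 2)
```

The two numbers on each printed line should agree, the first pair up to Monte Carlo error and the second pair up to rounding.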
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